Abstract

The nonnegative rank of a nonnegative matrix is the minimum number of nonnegative rank-one 
factors needed to reconstruct it exactly. The problem of determining this rank and computing the 
corresponding nonnegative factors is difficult; however it has many potential applications, e.g., in 
data mining, graph theory and computational geometry. In particular, it can be used to characterize 
the minimal size of any extended reformulation of a given combinatorial optimization program. In 
this paper, we introduce and study a related quantity, called the restricted nonnegative rank. We 
show that computing this quantity is equivalent to a problem in polyhedral combinatorics, and fully 
characterize its computational complexity. This in turn sheds new light on the nonnegative rank 
problem, and in particular allows us to provide new improved lower bounds based on its geometric 
interpretation. We apply these results to slack matrices and linear Euclidean distance matrices 
and obtain counter-examples to two conjectures of Beasley and Laffey; namely, we show that the
nonnegative rank of linear Euclidean distance matrices is not necessarily equal to their dimension, 
and that the rank of a matrix is not always greater than the nonnegative rank of its square. 
1 Introduction

The nonnegative rank of an $m \times n$ real nonnegative matrix $M \in \mathbb{R}^{m \times n}_+$ is the minimum number of
nonnegative rank-one factors needed to reconstruct $M$ exactly, i.e., the minimum $k$ such that there
exist $U \in \mathbb{R}^{m \times k}_+$ and $V \in \mathbb{R}^{k \times n}_+$ with $M = UV = \sum_{i=1}^{k} U_{:i} V_{i:}$. The pair $(U,V)$ is called a rank-$k$
nonnegative factorization^1 of $M$. The nonnegative rank of $M$ is denoted $\mathrm{rank}_+(M)$. Clearly,

$\mathrm{rank}(M) \le \mathrm{rank}_+(M) \le \min(m, n).$
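As a small illustration of the definition, the following sketch checks a candidate nonnegative factorization and rebuilds $M$ as a sum of nonnegative rank-one factors. The matrices `M`, `U`, `V` and the helper names are hypothetical, chosen only for this example.

```python
def matmul(U, V):
    """Plain matrix product of two lists of rows."""
    k = len(V)
    return [[sum(U[i][l] * V[l][j] for l in range(k)) for j in range(len(V[0]))]
            for i in range(len(U))]

def rank_one_sum(U, V):
    """Rebuild UV as the sum over l of the rank-one factors U[:, l] * V[l, :]."""
    m, k, n = len(U), len(V), len(V[0])
    total = [[0] * n for _ in range(m)]
    for l in range(k):
        for i in range(m):
            for j in range(n):
                total[i][j] += U[i][l] * V[l][j]
    return total

def is_nonnegative_factorization(M, U, V):
    """True if U and V are entrywise nonnegative and M = UV exactly."""
    nonneg = all(x >= 0 for row in U + V for x in row)
    return nonneg and matmul(U, V) == M

# Hypothetical 3-by-3 example with a rank-2 nonnegative factorization.
M = [[1, 1, 0], [1, 2, 1], [0, 1, 1]]
U = [[1, 0], [1, 1], [0, 1]]
V = [[1, 1, 0], [0, 1, 1]]
```

Here the inner dimension `len(V)` is 2, which is consistent with the bound $\mathrm{rank}_+(M) \le \min(m, n) = 3$.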

Determining the nonnegative rank and computing the corresponding nonnegative factorization is a 
relatively recently studied problem in linear algebra [HE]. In the literature, much more attention 
has been devoted to the approximate nonnegative factorization problem (called nonnegative matrix 
factorization, NMF for short [26]) consisting in finding two low-rank nonnegative factors U and V 
such that $M \approx UV$ or, more precisely, solving

$\min_{U \in \mathbb{R}^{m \times k}_+,\ V \in \mathbb{R}^{k \times n}_+} \|M - UV\|_F.$



1 Universite catholique de Louvain, CORE, B-1348 Louvain-la-Neuve, Belgium. E-mail: nicolas.gillis@uclouvain.be 
and francois.glineur@uclouvain.be. Nicolas Gillis is a research fellow of the Fonds de la Recherche Scientifique (F.R.S.- 
FNRS). This text presents research results of the Belgian Program on Interuniversity Poles of Attraction initiated by 
the Belgian State, Prime Minister's Office, Science Policy Programming. The scientific responsibility is assumed by the 
authors. 

1 Notice that matrices $U$ and $V$ in a rank-$k$ nonnegative factorization are not required to have rank $k$.
NMF has been widely used as a data analysis technique [5], e.g., in text mining, image processing,
hyperspectral data analysis, computational biology, and clustering. Nevertheless, few
theoretical results are known about the nonnegative rank, and better characterizations, in particular lower
bounds, could help practitioners. For example, efficient computation of nonnegative factorizations
could help to design new NMF algorithms using a two-step strategy [33]: first approximate $M$ with
a low-rank nonnegative matrix $A$ (e.g., using the singular value decomposition^2) and then compute a
nonnegative factorization of $A$. Bounds for the nonnegative rank could also help select the factorization
rank of the NMF, replacing the trial and error approach often used by practitioners. For example,
in hyperspectral image analysis, the nonnegative rank corresponds to the number of materials present
in the image and its computation could lead to more efficient algorithms detecting these constitutive
elements, see [7J HH [231 EI] and references therein.

An extended formulation (or lifting) for a polytope $P \subseteq \mathbb{R}^n$ is a polyhedron $Q \subseteq \mathbb{R}^{n+p}$ such that

$P = \mathrm{proj}_x(Q) := \{x \in \mathbb{R}^n \mid \exists y \in \mathbb{R}^p \text{ s.t. } (x,y) \in Q\}.$

Extended formulations whose size (number of constraints plus number of variables defining $Q$) is
polynomial in $n$ are called compact and are of great importance in integer programming. They allow
one to reduce significantly the size of the linear programming (LP) formulation of certain integer programs,
and therefore provide a way to solve them efficiently, i.e., in polynomial time (see [13] for a survey).
Yannakakis [36, Theorem 3] showed that the minimum size $s$ of an extended formulation of a polytope^3

$P = \{x \in \mathbb{R}^n \mid Cx \ge d,\ Ax = b\},$

is of the same order as the sum of its dimension $n$ and the nonnegative rank of its slack matrix $S_M \ge 0$,
where each column of the slack matrix is defined as

$S_M(:, i) = C v_i - d \ge 0, \quad i = 1, 2, \dots, m, \qquad (1.1)$

and the vectors $v_i$ are the $m$ vertices of the polytope $P$. Formally, we then have

$s = \Theta(n + \mathrm{rank}_+(S_M)).$

In particular, any rank-$k$ nonnegative factorization $(U, V)$ of $S_M = UV$ provides the following extended
formulation for $P$ with size $O(n + k)$:

$Q = \{(x, y) \in \mathbb{R}^{n+k} \mid Cx - Uy = d,\ Ax = b,\ y \ge 0\}. \qquad (1.2)$

In fact, $\mathrm{proj}_x(Q) \subseteq P$ since $Uy \ge 0$ implies $Cx \ge d$ for any $x \in \mathrm{proj}_x(Q)$, and $P \subseteq \mathrm{proj}_x(Q)$ since
$C v_i - U V(:,i) = d$ implies that $(v_i, V(:,i)) \in Q$ for all $i$ and therefore each vertex $v_i$ of $P$ belongs
to $\mathrm{proj}_x(Q)$. Intuitively, this extended formulation parametrizes the space of slacks of the original
polytope with the convex cone $\{Uy \mid y \ge 0\}$.
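The construction (1.1)-(1.2) can be sketched on a toy instance. The example below is an assumption made for illustration: it takes the unit square with no equality constraints $Ax = b$, builds its slack matrix, and checks that every vertex lifts into $Q$ when the trivial factorization $U = S_M$, $V = I$ is used.

```python
def slack_matrix(C, d, vertices):
    """Column i of the slack matrix is C v_i - d, as in (1.1)."""
    return [[sum(c * x for c, x in zip(row, v)) - di for v in vertices]
            for row, di in zip(C, d)]

def in_lifted_polytope(C, d, U, x, y):
    """Check (x, y) in Q = {(x, y) | Cx - Uy = d, y >= 0}, as in (1.2) without Ax = b."""
    if any(t < 0 for t in y):
        return False
    for row_c, row_u, di in zip(C, U, d):
        lhs = sum(c * xi for c, xi in zip(row_c, x)) - sum(u * yi for u, yi in zip(row_u, y))
        if lhs != di:
            return False
    return True

# Unit square {x | Cx >= d}: x1 >= 0, x2 >= 0, -x1 >= -1, -x2 >= -1 (illustrative data).
C = [[1, 0], [0, 1], [-1, 0], [0, -1]]
d = [0, 0, -1, -1]
vertices = [(0, 0), (1, 0), (1, 1), (0, 1)]

S = slack_matrix(C, d, vertices)
U = S                                                       # trivial factorization S = S * I
V = [[int(i == j) for j in range(4)] for i in range(4)]     # identity matrix
lifted = [in_lifted_polytope(C, d, U, v, [V[l][i] for l in range(4)])
          for i, v in enumerate(vertices)]
```

The trivial factorization gives no size reduction; a factorization with a smaller inner dimension $k$ would give a smaller lifted description in the same way.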

It is therefore interesting to compute bounds for the nonnegative rank in order to estimate the size
of these extended formulations. Recently, Goemans [22] used this result to show that the size of LP
formulations of the permutahedron (the polytope whose $n!$ vertices are the permutations of $[1, 2, \dots, n]$) is at
least $\Omega(n \log n)$ variables plus constraints (cf. Section 3).

We will see in Section 3.1 that the nonnegative rank is closely related to a problem in computational
geometry that consists in finding a polytope with minimum number of vertices nested between two 

2 Even though the optimal low-rank approximation of a nonnegative matrix might not necessarily be nonnegative
(except in the rank-one case), it is often the case in practice [24].
3 This can be generalized to polyhedra [15].
given polytopes. Therefore a better understanding of the properties of the nonnegative rank would
presumably also improve the characterization of the solutions to this geometric problem.

The nonnegative rank also has connections with other problems, e.g., in communication complexity 
theory J3S1 EZ] , probability [8] , and graph theory (cf . Section ED . 

The main goal of this paper is to provide improved lower bounds on the nonnegative rank. In
Section 2, we introduce a new related quantity called the restricted nonnegative rank. Generalizing a
recent result of Vavasis [33] (see also [28]), we show that computing this quantity is equivalent to a
problem in polyhedral combinatorics, and fully characterize its computational complexity. In Section 3,
based on the geometric interpretation of the nonnegative rank and the relationship with the
restricted nonnegative rank, we derive new improved lower bounds for the nonnegative rank. Finally,
in Section 4, we apply our results to slack matrices and linear Euclidean distance matrices. We obtain
counter-examples to two conjectures of Beasley and Laffey [2]; namely, we show that the nonnegative
rank of linear Euclidean distance matrices is not necessarily equal to their dimension, and that the
rank of a matrix is not always greater than the nonnegative rank of its square.

Notation. The set of real matrices of dimension $m$ by $n$ is denoted $\mathbb{R}^{m \times n}$; for $A \in \mathbb{R}^{m \times n}$, we denote
the $i$th column of $A$ by $A_{:i}$ or $A(:,i)$, the $j$th row of $A$ by $A_{j:}$ or $A(j,:)$, and the entry at position
$(i,j)$ by $A_{ij}$ or $A(i,j)$; for $b \in \mathbb{R}^{m \times 1} = \mathbb{R}^m$, we denote the $i$th entry of $b$ by $b_i$. Notation $A(I,J)$
refers to the submatrix of $A$ with row and column indices respectively in $I$ and $J$, and $a{:}b$ is the
set $\{a, a+1, \dots, b-1, b\}$ (for $a$ and $b$ integers with $a \le b$). The set $\mathbb{R}^{m \times n}$ with component-wise
nonnegative entries is denoted $\mathbb{R}^{m \times n}_+$. The matrix $A^T$ is the transpose of $A$. The rank of a matrix $A$
is denoted $\mathrm{rank}(A)$, its column space $\mathrm{col}(A)$. The convex hull of the set of points $S$, or the convex
hull of the columns of the matrix $S$, is denoted $\mathrm{conv}(S)$. The number of vertices of the polytope
$Q$ is denoted by $\#\mathrm{vertices}(Q)$. The concatenation of the columns of two matrices $A \in \mathbb{R}^{m \times n}$ and
$B \in \mathbb{R}^{m \times p}$ is denoted $[A\ B] \in \mathbb{R}^{m \times (n+p)}$. The sparsity pattern of a vector is the set of indices of its
zero entries (it is the complement of its support).

2 Restricted Nonnegative Rank 

In this section, we analyze the following quantity 

Definition 1. The restricted nonnegative rank of a nonnegative matrix $M$ is the minimum value
of $k$ such that there exist $U \in \mathbb{R}^{m \times k}_+$ and $V \in \mathbb{R}^{k \times n}_+$ with $M = UV$ and $\mathrm{rank}(U) = \mathrm{rank}(M)$, i.e.,
$\mathrm{col}(U) = \mathrm{col}(M)$. It is denoted $\mathrm{rank}^*_+(M)$.

In particular, given a nonnegative matrix $M$, we are interested in computing its restricted nonnegative
rank $\mathrm{rank}^*_+(M)$ and a corresponding nonnegative factorization, i.e., solve

(RNR) Given a nonnegative matrix $M \in \mathbb{R}^{m \times n}_+$, find $k = \mathrm{rank}^*_+(M)$ and compute $U \in
\mathbb{R}^{m \times k}_+$ and $V \in \mathbb{R}^{k \times n}_+$ such that $M = UV$ and $\mathrm{rank}(U) = \mathrm{rank}(M) = r$.

Without the rank constraint on the matrix U, this problem reduces to the standard nonnegative rank 
problem. Motivation to study this restriction includes the following 

1. The restricted nonnegative rank provides a new upper bound for the nonnegative rank, since
$\mathrm{rank}_+(M) \le \mathrm{rank}^*_+(M)$.

2. The restricted nonnegative rank can be characterized much more easily. In particular, its geometrical
interpretation (Section 2.1) will lead to new improved lower bounds for the nonnegative
rank (Sections 3 and 4).
RNR is a generalization of exact nonnegative matrix factorization (exact NMF) introduced by
Vavasis [33]. Noting $r = \mathrm{rank}(M)$, exact NMF asks whether $\mathrm{rank}_+(M) = r$ and, if the answer is
positive, to compute a rank-$r$ nonnegative factorization of $M$. If $\mathrm{rank}_+(M) = r$ then it is clear that
$\mathrm{rank}^*_+(M) = \mathrm{rank}_+(M)$ since the rank of $U$ in any rank-$r$ nonnegative factorization $(U, V)$ of $M$ must
be equal to $r$.

Vavasis studies the computational complexity of exact NMF and proves it is NP-hard by showing its
equivalence with a problem in polyhedral combinatorics called intermediate simplex. This construction
requires both the dimensions of matrix $M$ and its rank $r$ to increase to obtain NP-hardness. This
result also implies NP-hardness of RNR when the rank of matrix $M$ is not fixed. However, in the case
where the rank $r$ of matrix $M$ is fixed, no complexity results are known (except in the trivial cases
$r = 1, 2$ [32]). The situation for RNR is quite different: we are going to show that RNR can be solved
in polynomial time when $r = 3$ and that it is NP-hard for any fixed $r \ge 4$. In particular, this result
implies that exact NMF can be solved in polynomial time for rank-three nonnegative matrices.

In order to do so, we first show equivalence of RNR with another problem in polyhedral combinatorics,
closely related to intermediate simplex (Section 2.1), and then apply results from the
computational geometry literature to conclude about its computational complexity for fixed rank
(Section 2.2).

2.1 Equivalence with the Nested Polytopes Problem 

Let us consider the following problem, called the nested polytopes problem (NPP):

(NPP) Given a bounded polyhedron

$P = \{x \in \mathbb{R}^{r-1} \mid 0 \le f(x) = Cx + d\},$

with $(C\ d) \in \mathbb{R}^{m \times r}$ of rank $r$, and a set $S$ of $n$ points in $P$ not contained in any hyperplane
(i.e., $\mathrm{conv}(S)$ is full-dimensional), find the minimum number $k$ of points in $P$ whose convex
hull $T$ contains $S$, i.e., $S \subseteq T \subseteq P$.

Polytope $P$ is referred to as the outer polytope, and $\mathrm{conv}(S)$ as the inner polytope; note that they
are given by two distinct types of representations (faces for $P$, extreme points for $\mathrm{conv}(S)$).

The intermediate simplex problem mentioned earlier and introduced by Vavasis [33] is a particular
case of NPP in which one asks whether $k$ is equal to $r$ (which is the minimum possible value), i.e.,
whether there exists a simplex $T$ (defined by $r$ vertices in an $(r-1)$-dimensional space) contained in $P$ and
containing $S$.

We now prove equivalence between RNR and NPP. It is a generalization of the result of Vavasis
[33], who showed equivalence of exact NMF and intermediate simplex.

Theorem 1. There is a polynomial-time reduction from RNR to NPP and vice-versa. 

Proof. Let us construct a reduction of RNR to NPP. First we (1) delete the zero rows and columns of
$M$ and (2) normalize its columns such that $M$ becomes column stochastic (columns are nonnegative
and sum to one). One can easily check that it gives a polynomially equivalent RNR instance [9]. We
then decompose $M$ as the product of two rank-$r$ matrices (using, e.g., reduction to row-echelon form)

$M = AB \iff M_{:i} = \sum_{l=1}^{r} A_{:l} B_{li} \quad \forall i, \qquad (2.1)$

where $r = \mathrm{rank}(M)$, $A \in \mathbb{R}^{m \times r}$ and $B \in \mathbb{R}^{r \times n}$. We observe that one can assume without loss of
generality that the columns of $A$ and $B$ sum to one. Indeed, since $M$ is column stochastic, at least
one column of $A$ does not sum to zero (otherwise all columns of $AB = M$ would sum to zero). One
can then update $A$ and $B$ in the following way so that their columns sum to one:
• For each column of $A$ which sums to zero, add a column of $A$ which does not sum to zero, and
update $B$ accordingly;

• Normalize the columns of A such that they sum to one, and update B accordingly; 

• Observe that since the columns of A sum to one, M is column stochastic and since M = AB, 
the columns of B must also sum to one. 
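The normalization step in the second bullet can be sketched as follows. This is a minimal illustration with hypothetical data, assuming the first bullet has already been applied so that no column of $A$ sums to zero; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def column_sums(A):
    return [sum(col) for col in zip(*A)]

def normalize_factors(A, B):
    """Scale column l of A to sum to one and row l of B by the inverse factor,
    so that the product AB is unchanged (assumes no column of A sums to zero)."""
    sums = column_sums(A)
    A_n = [[Fraction(a) / s for a, s in zip(row, sums)] for row in A]
    B_n = [[Fraction(b) * s for b in row] for row, s in zip(B, sums)]
    return A_n, B_n

# Hypothetical factors whose product is column stochastic.
A = [[1, 0], [1, 2]]
B = [[Fraction(1, 4), Fraction(1, 2)], [Fraction(1, 4), Fraction(0)]]
A_n, B_n = normalize_factors(A, B)
```

After the update, the columns of `B_n` sum to one automatically, as stated in the third bullet.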

In order to find a solution of RNR, we have to find $U \in \mathbb{R}^{m \times k}_+$ and $V \in \mathbb{R}^{k \times n}_+$ such that $M = UV$
and $\mathrm{rank}(U) = r$. For the same reasons as for $A$ and $B$, $U$ and $V$ can be assumed to be column
stochastic without loss of generality. Moreover, since

$M = UV = AB,$

and $\mathrm{rank}(M) = \mathrm{rank}(A) = \mathrm{rank}(U) = r$, the column spaces of $M$, $A$ and $U$ coincide, implying that
the columns of $U$ must be linear combinations of the columns of $A$. The columns of $U$ must then
belong to the following set

$Q = \{u \in \mathbb{R}^m \mid u \in \mathrm{col}(A),\ u \ge 0 \text{ and } \sum_{i=1}^{m} u_i = 1\}. \qquad (2.2)$

One can then reduce the search space to the $(r-1)$-dimensional polyhedron corresponding to the
coefficients of all possible linear combinations of the columns of $A$ generating stochastic columns.
Defining

$C(:,i) = A(:,i) - A(:,r), \quad 1 \le i \le r-1, \quad \text{and} \quad d = A(:,r),$

and introducing the affine function $f : \mathbb{R}^{r-1} \to \mathbb{R}^m : x \mapsto f(x) = Cx + d$, which is injective since $C$ is full
rank (because $A$ is full rank), this polyhedron can be defined as

$P = \{x \in \mathbb{R}^{r-1} \mid A(:,1{:}r{-}1)\,x + (1 - \sum_{i=1}^{r-1} x_i)\,A(:,r) \ge 0\} = \{x \in \mathbb{R}^{r-1} \mid f(x) \ge 0\}. \qquad (2.3)$

Note that $B(1{:}r{-}1,j) \in P$ for all $j$ since $M(:,j) = AB(:,j) = f(B(1{:}r{-}1,j)) \ge 0$ for all $j$.
Let us show that $P$ is bounded: suppose $P$ is unbounded, then

$\exists x \in P,\ \exists y \ne 0 \in \mathbb{R}^{r-1},\ \forall \alpha \ge 0 : x + \alpha y \in P$
$\Rightarrow\ C(x + \alpha y) + d = (Cx + d) + \alpha Cy \ge 0.$

Since $Cx + d \ge 0$, this implies that $Cy \ge 0$. Observe that columns of $C$ sum to zero (since the columns
of $A$ sum to one) so that $Cy$ sums to zero as well; moreover, $C$ is full rank and $y$ is nonzero, implying
that $Cy$ is nonzero and therefore that $Cy$ must contain at least one negative entry, a contradiction.
Notice that the set $Q$ can be equivalently written as

$Q = \{u \in \mathbb{R}^m \mid u = f(x),\ x \in P\}.$

Noting $X = [x_1\ x_2\ \dots\ x_k] \in \mathbb{R}^{(r-1) \times k}$, $f(X) = [f(x_1)\ f(x_2)\ \dots\ f(x_k)] = CX + [d\ d\ \dots\ d]$, we finally have

$\exists U \in \mathbb{R}^{m \times k},\ V \in \mathbb{R}^{k \times n}$ column stochastic with $\mathrm{rank}(U) = \mathrm{rank}(M)$ and $M = UV$

$\iff \exists x_1, x_2, \dots, x_k \in P$ and $V \in \mathbb{R}^{k \times n}$ column stochastic s.t. $M = f(B(1{:}r{-}1,:)) = f(X)V = f(XV)$

$\iff \exists x_1, x_2, \dots, x_k \in P$ and $V \in \mathbb{R}^{k \times n}$ column stochastic s.t. $B(1{:}r{-}1,:) = XV.$
The first equivalence follows from the above derivations (i.e., $U = f(X)$ for some $x_1, x_2, \dots, x_k \in P$);
$f(X)V = f(XV)$ because $V$ is column stochastic (so that $[d\ d\ \dots\ d]V = [d\ d\ \dots\ d]$), and the second
equivalence is a consequence of the fact that $f$ is an injection.

We have then reduced RNR to NPP: find the minimum number $k$ of points $x_i$ in $P$ such that $n$
given points (the columns of $B(1{:}r{-}1,:)$ constructed from the columns of $M$, which define the set $S$
in the NPP instance) are contained in the convex hull of these points (since $V$ is column stochastic).
Because all steps in the above derivation are equivalences, we have actually also defined a reduction
from NPP to RNR; to map an NPP instance to an RNR instance, we take

$M(:,i) = f(s_i) = C s_i + d \ge 0, \quad s_i \in S, \quad 1 \le i \le n,$

and $\mathrm{rank}(M) = r$ because the $n$ points $s_i$ are not all contained in any hyperplane (they affinely span
$P$). □
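The map used in the proof can be illustrated on a tiny instance. The sketch below uses hypothetical column-stochastic factors `A` and `B` with $r = 2$; it builds $C$ and $d$ from $A$, and checks that the points $B(1{:}r{-}1, j)$ are mapped by $f$ onto the columns of $M$, hence lie in $P$.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def affine_map_from_factor(A):
    """Return (C, d) with C(:, i) = A(:, i) - A(:, r) and d = A(:, r)."""
    r = len(A[0])
    C = [[row[i] - row[r - 1] for i in range(r - 1)] for row in A]
    d = [row[r - 1] for row in A]
    return C, d

def f(C, d, x):
    """Affine map f(x) = Cx + d."""
    return [sum(c * xi for c, xi in zip(row, x)) + di for row, di in zip(C, d)]

# Hypothetical column-stochastic factors with r = 2 (illustrative data only).
half = Fraction(1, 2)
A = [[half, 0], [half, half], [0, half]]
B = [[Fraction(1), half, Fraction(0)], [Fraction(0), half, Fraction(1)]]
M = matmul(A, B)

C, d = affine_map_from_factor(A)
r, n = len(A[0]), len(B[0])
points = [[B[i][j] for i in range(r - 1)] for j in range(n)]   # the set S of the NPP instance
images = [f(C, d, x) for x in points]
columns_of_M = [[M[i][j] for i in range(len(M))] for j in range(n)]
```

In this instance $P$ is the interval $[0, 1]$ and the points of $S$ are $1$, $1/2$ and $0$.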

It is worth noting that $M$ would be the slack matrix of $P$ if $S$ was the set of vertices of $P$ (cf.
Introduction). This will be useful later in Section 4.1.



2.2 Computational Complexity 
2.2.1 Rank-Three Matrices 

Using Theorem 1, RNR of a rank-three matrix can be reduced to a two-dimensional nested polytopes
problem^4. Therefore, one has to find a convex polygon $T$ with minimum number of vertices nested
in between two given convex polygons $S \subseteq P$. This problem has been studied by Aggarwal et al. in
[1], who proposed an algorithm running in $O(p \log k)$ operations^5, where $p$ is the total number of
vertices of the given polygons $S$ and $P$, and $k$ is the number of vertices of the minimal nested polygon
$T$. If $M$ is an $m$-by-$n$ matrix then $p \le m + n$ since $S$ has $n$ vertices, and the polygon $P$ is defined by
$m$ inequalities so that it has at most $m$ vertices. Moreover $k = \mathrm{rank}^*_+(M) \le \min(m, n)$ follows from
the trivial solutions $T = S$ and $T = P$. Finally, we conclude that one can compute the restricted
nonnegative rank of a rank-three $m$-by-$n$ matrix in $O((m + n) \log(\min(m, n)))$ operations.

Theorem 2. For $\mathrm{rank}(M) \le 3$, RNR can be solved in polynomial time.

Proof. Cases $r = 1, 2$ are trivial since any rank-1 (resp. 2) nonnegative matrix can always be expressed
as the sum of 1 (resp. 2) nonnegative factors [32].

Case $r = 3$ follows from Theorem 1 and the polynomial-time algorithm of Aggarwal et al. [1]. □

For the sake of completeness, we sketch the main ideas of the algorithm of Aggarwal et al. They
first make the following observations: (1) any vertex of a solution $T$ can be assumed to belong to
the boundary of the polygon $P$ (otherwise it can be projected back on $P$ in order to generate a new
solution containing the previous one), (2) any segment whose ends are on the boundary of $P$ and
tangent to $S$ (i.e., $S$ is on one side of the segment, and the segment touches $S$) defines a polygon with
the boundary of $P$ which must contain a vertex of any feasible solution $T$ (otherwise the tangent point
on $S$ could not be contained in $T$), see, e.g., set $Q$ on Figure 1 delimited by the segment $[p_1, p_2]$ and
the boundary of $P$, and such that $T \cap Q \ne \emptyset$ for any feasible solution $T$.

Starting from any point $p_1$ of the boundary of $P$, one can trace the tangent to $S$ and hence obtain
the next intersection $p_2$ with $P$. Point $p_2$ is chosen as the next vertex of a solution $T$, and the same
procedure is applied (say $k$ times) until the algorithm can reach the initial point without going through
$S$, see Figure 1. This generates a feasible solution $T(p_1) = \mathrm{conv}(\{p_1, p_2, \dots, p_k\})$. Because of (1) and
(2), this solution has at most one vertex more than an optimal one, i.e., $k \le \mathrm{rank}^*_+(M) + 1$ (since $T$

4 See also Appendix A.1, where a MATLAB code is provided.

5 Wang generalized the result for non-convex polygons [33]. Bhadury and Chandrasekaran propose an algorithm to
compute all possible solutions [6].
determines with the boundary of $P$ $k - 1$ disjoint polygons tangent to $S$). Moreover, because of (1)
and (2), there must exist a vertex of an optimal solution on the boundary of $P$ between $p_1$ and $p_2$.

The point $p_1$ is then replaced by the so-called 'contact change points' located on this part of
the boundary of $P$ while the corresponding solution $T(p_1)$ is updated using the procedure described
above. The contact change points are: (a) the vertices of $P$ between $p_1$ and $p_2$, and (b) the points for
which one tangent point of $T(p_1)$ on $S$ is changed when $p_1$ is replaced by them. This (finite) set of
points provides a list of candidates where the number of vertices of the solution $T$ could potentially be
reduced (i.e., where $p_1$ and $p_k$ could coincide) by replacing $p_1$ by one of these points. It is then possible
to check whether the current solution can be improved or not, and guarantee global optimality. In
the example of Figure 1, moving $p_1$ to the (only) vertex of $P$ between $p_1$ and $p_2$ generates an optimal
solution of this RNR instance (since it reduces the original solution from 5 to 4 vertices).

2.2.2 Higher Rank Matrices 

For a rank-four matrix, RNR reduces to a three-dimensional problem of finding a polytope $T$ with
the minimum number of vertices nested between two other polytopes $S \subseteq P$. This problem has been
studied by Das et al. [To] and has been shown to be NP-hard when minimizing the number of faces
of $T$ (the reduction is from planar-3SAT). From this result, one can deduce using a duality argument^6
that minimizing the number of vertices of $T$ is NP-hard as well [16] [XT].

Theorem 3. For $\mathrm{rank}(M) \ge 4$, RNR is NP-hard.

Proof. This is a consequence of Theorem 1 and the NP-hardness results of Das et al. [17] [15] [TB]. □

Note however that several approximation algorithms have been proposed in the literature. For
example, Mitchell and Suri [29] approximate $\mathrm{rank}^*_+(M)$ in case $\mathrm{rank}(M) = 4$ within an $O(\log p)$
factor, where $p$ is the total number of vertices of the given polytopes $S$ and $P$. Clarkson proposes
a randomized algorithm finding a polytope $T$ with at most $r^*_+\, O(5 d \ln(r^*_+))$ vertices and running in
$O({r^*_+}^2 p^{1+\delta})$ expected time (with $r^*_+ = \mathrm{rank}^*_+(M)$, $d = \mathrm{rank}(M) - 1$ and $\delta$ any fixed value $> 0$).

6 Taking the polar of the three nested polytopes exchanges the roles of the inner and outer polytopes, and transforms
face descriptions into vertex descriptions, so that the description of the inner and outer polytopes is unchanged but the
intermediate polytope is now described by its vertices.
2.3 Some Properties

In this section, we derive some useful properties of the restricted nonnegative rank. 

Example 1. Construct $M$ using the following NPP instance: $P$ is the three-dimensional cube $P =
\{x \in \mathbb{R}^3 \mid 0 \le x_i \le 1,\ 1 \le i \le 3\}$, with 6 faces, and $S$ is the set of its 8 vertices $S = \{x \in \mathbb{R}^3 \mid x_i \in
\{0,1\},\ 1 \le i \le 3\}$. By construction, the convex hull of $S$ is equal to $P$ and the unique and optimal
solution to this NPP instance is $T = P = \mathrm{conv}(S)$ with 8 vertices. By Theorem 1, the corresponding
6-by-8 matrix $M$ of the RNR instance, whose columns (in some ordering of the faces and vertices) are

$M(:,i) = f(s_i) = \begin{pmatrix} s_i \\ \mathbf{1} - s_i \end{pmatrix} \in \{0,1\}^6, \quad s_i \in S, \quad 1 \le i \le 8, \qquad (2.4)$

has restricted nonnegative rank equal to 8 (note that its rank is 4 and its nonnegative rank is 6, see
Section 3).
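The matrix of Example 1 can be rebuilt from the NPP instance via $M(:,i) = f(s_i)$. The sketch below is illustrative: the ordering of the rows (faces $x_i \ge 0$ first, then $1 - x_i \ge 0$) and of the columns (vertices in lexicographic order) is an assumption, and `rank` is a small helper written here for exact rank computation, not a library routine.

```python
from fractions import Fraction
from itertools import product

def rank(M):
    """Rank via exact Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    rk = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][c] != 0:
                factor = rows[i][c] / rows[rk][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

# Vertices of the unit cube (assumed lexicographic ordering).
vertices = list(product([0, 1], repeat=3))

# Slack of each vertex with respect to x_i >= 0 (first three rows)
# and 1 - x_i >= 0 (last three rows).
M = [[v[i] for v in vertices] for i in range(3)] + \
    [[1 - v[i] for v in vertices] for i in range(3)]

num_positive_entries = sum(sum(row) for row in M)
```

The resulting 6-by-8 matrix has rank 4 and 24 positive entries, in agreement with the figures quoted in the text for this example.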

It is well-known that for a matrix $M \in \mathbb{R}^{m \times n}_+$, we have $\mathrm{rank}_+(M) \le \min(m, n)$; surprisingly, this
does not hold for the restricted nonnegative rank.

Lemma 1. For $M \in \mathbb{R}^{m \times n}_+$,

$\mathrm{rank}^*_+(M) \le n \quad \text{but} \quad \mathrm{rank}^*_+(M) \nleq m.$

Proof. The first inequality is trivial since $M = MI$ ($I$ being the identity matrix). Example 1 provides
an example where $\mathrm{rank}^*_+(M) = 8$ for a 6-by-8 matrix $M$. □

Lemma 1 implies that in general $\mathrm{rank}^*_+(M) \ne \mathrm{rank}^*_+(M^T)$, unlike the rank and nonnegative rank
[12]. Note however that when $\mathrm{rank}(M) \le 3$, we have

$\mathrm{rank}^*_+(M) \le \min(m, n),$

because the number of vertices of the outer polygon $P$ in the NPP instance is smaller than or equal to its
number of facets $m$ in the two-dimensional case or lower, and the solution $T = P$ is always feasible.



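Lemma 2. Let $A \in \mathbb{R}^{m \times n}_+$ and $B \in \mathbb{R}^{m \times r}_+$, then

$\mathrm{rank}^*_+([A\ B]) \le \mathrm{rank}^*_+(A) + \mathrm{rank}^*_+(B).$

Proof. Let $(U_a, V_a)$ and $(U_b, V_b)$ be solutions of RNR for $A$ and $B$ respectively, then

$[A\ B] = [U_a\ U_b] \begin{pmatrix} V_a & 0 \\ 0 & V_b \end{pmatrix}$

and $\mathrm{rank}([U_a\ U_b]) = \mathrm{rank}([A\ B])$ since $\mathrm{col}(U_a) = \mathrm{col}(A)$ and $\mathrm{col}(U_b) = \mathrm{col}(B)$ by definition. □

Lemma 3. Let $M \in \mathbb{R}^{m \times n}_+$ with $\mathrm{rank}(M) = r$ and $\mathrm{rank}_+(M) = r_+$, $U \in \mathbb{R}^{m \times r_+}_+$ and $V \in \mathbb{R}^{r_+ \times n}_+$ with
$M = UV$. Then

$r_+ < \mathrm{rank}^*_+(M) \ \Rightarrow\ r < \mathrm{rank}(U) \le r_+ \ \text{and} \ r \le \mathrm{rank}(V) < r_+.$

Moreover, if $M$ is symmetric,

$r_+ < \mathrm{rank}^*_+(M) \ \Rightarrow\ r < \mathrm{rank}(U) < r_+ \ \text{and} \ r < \mathrm{rank}(V) < r_+.$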
Proof. Clearly,

$r \le \mathrm{rank}(U) \le r_+ \ \text{and} \ r \le \mathrm{rank}(V) \le r_+.$

If $\mathrm{rank}(U) = r$, we would have $\mathrm{rank}^*_+(M) = r_+$, which is a contradiction, and $\mathrm{rank}(V) = r_+$ would
imply that $V$ has a right pseudo-inverse $V^\dagger$, so that we could write $U = MV^\dagger$ and then $r \le \mathrm{rank}(U) \le
\min(r, \mathrm{rank}(V^\dagger)) \le r$, a contradiction for the same reason.

In case $M$ is symmetric, to show that $\mathrm{rank}(U) < r_+$ and $\mathrm{rank}(V) > r$, we use symmetry and
observe that $UV = M = M^T = V^T U^T$. □



Corollary 1. Given a nonnegative matrix $M$,

$\mathrm{rank}^*_+(M) \le \mathrm{rank}(M) + 1 \ \Rightarrow\ \mathrm{rank}_+(M) = \mathrm{rank}^*_+(M).$

If $M$ is symmetric,

$\mathrm{rank}^*_+(M) \le \mathrm{rank}(M) + 2 \ \Rightarrow\ \mathrm{rank}_+(M) = \mathrm{rank}^*_+(M).$

Proof. Let $r = \mathrm{rank}(M)$, $r_+ = \mathrm{rank}_+(M)$, and $U \in \mathbb{R}^{m \times r_+}_+$ and $V \in \mathbb{R}^{r_+ \times n}_+$ such that $M = UV$. If
$r_+ < \mathrm{rank}^*_+(M)$, by Lemma 3 we have

$r < \mathrm{rank}(U) \le r_+ < \mathrm{rank}^*_+(M),$

which is a contradiction if $\mathrm{rank}^*_+(M) \le r + 1$. If $M$ is symmetric, we have $\mathrm{rank}(U) < r_+$ and the
above equation is a contradiction if $\mathrm{rank}^*_+(M) \le r + 2$. □

For example, this implies that to find a symmetric rank-three nonnegative matrix with $\mathrm{rank}_+(M) <
\mathrm{rank}^*_+(M)$, we need $\mathrm{rank}^*_+(M) > \mathrm{rank}(M) + 2 = 5$ and therefore have to consider matrices of size at
least 6-by-6 with $\mathrm{rank}^*_+(M) = 6$.

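Example 2. Let us consider the following matrix $M$ and the rank-5 nonnegative factorization $(U, V)$,

$M = \begin{pmatrix} 0 & 1 & 4 & 9 & 16 & 25 \\ 1 & 0 & 1 & 4 & 9 & 16 \\ 4 & 1 & 0 & 1 & 4 & 9 \\ 9 & 4 & 1 & 0 & 1 & 4 \\ 16 & 9 & 4 & 1 & 0 & 1 \\ 25 & 16 & 9 & 4 & 1 & 0 \end{pmatrix} = UV,$

$U = \begin{pmatrix} 5 & 0 & 4 & 0 & 1 \\ 3 & 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 4 & 1 \\ 0 & 1 & 0 & 4 & 1 \\ 0 & 3 & 1 & 1 & 0 \\ 0 & 5 & 4 & 0 & 1 \end{pmatrix}, \quad V = \begin{pmatrix} 0 & 0 & 0 & 1 & 3 & 5 \\ 5 & 3 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 \end{pmatrix}.$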

One can check that $\mathrm{rank}(M) = 3$, and, using the algorithm of Aggarwal et al. [1], the restricted
nonnegative rank can be computed^7 and is equal to 6. Using the above decomposition, it is clear that
$\mathrm{rank}_+(M) \le 5 < \mathrm{rank}^*_+(M) = 6$. By Lemma 3, for any rank-$\mathrm{rank}_+(M)$ nonnegative factorization $(U, V)$
of $M$, we then must have $3 < \mathrm{rank}(U) = 4 < \mathrm{rank}_+(M)$, implying that $\mathrm{rank}_+(M) = 5$.
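The claims of Example 2 that do not require the nested-polygon algorithm can be checked numerically. In the sketch below, $M$ is taken to be the 6-by-6 linear Euclidean distance matrix with entries $(i - j)^2$, and the entries of `U` and `V` are an assumed reading of the factorization of Example 2 that the code validates by checking $M = UV$; `rank` is a helper written for this illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def rank(M):
    """Rank via exact Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    rk = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][c] != 0:
                factor = rows[i][c] / rows[rk][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

# Linear Euclidean distance matrix on the points 0, 1, ..., 5.
M = [[(i - j) ** 2 for j in range(6)] for i in range(6)]

# Assumed entries of the rank-5 nonnegative factorization (validated by M == UV).
U = [[5, 0, 4, 0, 1],
     [3, 0, 1, 1, 0],
     [1, 0, 0, 4, 1],
     [0, 1, 0, 4, 1],
     [0, 3, 1, 1, 0],
     [0, 5, 4, 0, 1]]
V = [[0, 0, 0, 1, 3, 5],
     [5, 3, 1, 0, 0, 0],
     [0, 0, 1, 1, 0, 0],
     [1, 0, 0, 0, 0, 1],
     [0, 1, 0, 0, 1, 0]]

# M with the first column of U appended.
M_with_u1 = [row + [U[i][0]] for i, row in enumerate(M)]
```

The check confirms $\mathrm{rank}(M) = 3$ and $\mathrm{rank}(U) = 4$, so this factorization is not a restricted one; the value $\mathrm{rank}^*_+(M) = 6$ itself is not verified here.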

As we have already seen with Lemma 1, the restricted nonnegative rank does not share all the
nice properties of the rank and the nonnegative rank functions [12]. The next lemmas exploit
Example 2 further to show different behavior between nonnegative rank and restricted nonnegative
rank.



Lemma 4. Let $A \in \mathbb{R}^{m \times n}_+$ and $B \in \mathbb{R}^{m \times n}_+$, then

$\mathrm{rank}^*_+(A + B) \nleq \mathrm{rank}^*_+(A) + \mathrm{rank}^*_+(B).$



7 The problem is actually trivial because each vertex of the inner polygon $S$ is located on a different edge of the polygon
$P$, so that they define with the boundary of $P$ 6 disjoint polygons tangent to $S$. This implies that $\mathrm{rank}^*_+(M) = 6$, cf.
Section 2.2.1. This matrix is actually a linear Euclidean distance matrix which will be analyzed later in Section 4.2.
Proof. Take $M$, $U$ and $V$ from Example 2 and construct $A = U(:,1{:}3)V(1{:}3,:)$ with $\mathrm{rank}^*_+(A) = 3$
(since $\mathrm{rank}(A) = 3$), $B = U(:,4{:}5)V(4{:}5,:)$ with $\mathrm{rank}^*_+(B) = 2$ (trivial) and $\mathrm{rank}^*_+(A + B) = 6$ since
$A + B = M$. □

Lemma 5. Let $A \in \mathbb{R}^{m \times n}_+$ and $B \in \mathbb{R}^{m \times r}_+$, then

$\mathrm{rank}^*_+([A\ B]) \ngeq \mathrm{rank}^*_+(A),$

where $[A\ B] \in \mathbb{R}^{m \times (n+r)}_+$ denotes the concatenation of the columns of $A$ and $B$.

Proof. Let us take $M$, $U$ and $V$ from Example 2 and construct $A = M$ and $B = U(:,1)$ with
$\mathrm{rank}^*_+([A\ B]) \le 5$ since $\mathrm{rank}([A\ B]) = 4$ (this can be checked easily) and $[A\ B] = U[V\ e_1]$ with
$\mathrm{rank}(U) = 4$ (where $e_i$ denotes the $i$th column of the identity matrix of appropriate dimension). □

Lemma 6. Let $B \in \mathbb{R}^{m \times r}_+$ and $C \in \mathbb{R}^{r \times n}_+$, then

$\mathrm{rank}^*_+(BC) \nleq \min(\mathrm{rank}^*_+(B), \mathrm{rank}^*_+(C)).$

Proof. See Example 2, in which $\mathrm{rank}^*_+(M) = 6$ and $\mathrm{rank}^*_+(U) \le 5$ by Lemma 1. □



3 Lower Bounds for the Nonnegative Rank 

In this section, we provide new lower bounds for the nonnegative rank based on the restricted nonnegative
rank. Recall that the restricted nonnegative rank already provides an upper bound for the
nonnegative rank since for an $m$-by-$n$ nonnegative matrix $M$,

$0 \le \mathrm{rank}(M) \le \mathrm{rank}_+(M) \le \mathrm{rank}^*_+(M) \le n. \qquad (3.1)$

Notice that this bound can only be computed efficiently in the case $\mathrm{rank}(M) = 3$ (see Theorems 2
and 3).

As mentioned in the introduction, it might also be interesting to compute lower bounds on the 
nonnegative rank. Some work has already been done in this direction, including the following 

1. Let $M \in \mathbb{R}^{m \times n}_+$ be any weighted biadjacency matrix of a bipartite graph $G = (V_1 \cup V_2, E \subseteq
V_1 \times V_2)$ with $M(i,j) > 0 \iff (V_1(i), V_2(j)) \in E$. A biclique of $G$ is a complete bipartite
subgraph (it corresponds to a positive rectangular submatrix of $M$). One can easily check
that each rank-one factor $(U_{:i}, V_{i:})$ of any rank-$k$ nonnegative factorization $(U, V)$ of $M$ can be
interpreted as a biclique of $M$ (i.e., as a positive rectangular submatrix) since $M = \sum_{i=1}^{k} U_{:i} V_{i:}$.
Moreover, these bicliques $(U_{:i}, V_{i:})$ must cover $G$ completely since $M = UV$. The minimum
number of bicliques needed to cover $G$ is then a lower bound for the nonnegative rank. It
is called the biclique partition number and denoted $b(G)$, see [35] and references therein. Its
computation is NP-complete [30] and is directly related to the minimum biclique cover problem
(MBC).

Consider for example the matrix $M$ from Example 1. The largest biclique of the graph $G$
generated by $M$ has 4 edges^8. Since $G$ has 24 edges, we have $b(G) \ge 24/4 = 6$ and therefore
$6 \le \mathrm{rank}_+(M) \le \min(m, n) = 6$.

A crown graph $G$ is a bipartite graph with $|V_1| = |V_2| = n$ and $E = \{(V_1(i), V_2(j)) \mid i \ne j\}$
(it can be viewed as a biclique where the horizontal edges have been removed). de Caen, Gregory
and Pullman [T2] showed that

$b(G) = \min\{k \mid n \le \binom{k}{\lfloor k/2 \rfloor}\} = O(\log n).$

8 This can be computed explicitly, e.g., with a brute force approach. Note however that finding the biclique with the
maximum number of edges is a combinatorial NP-hard optimization problem [31]. It is closely related to a variant of
the approximate nonnegative factorization problem [20].
Beasley and Laffey [2] studied linear Euclidean distance matrices defined as $M(i,j) = (a_i - a_j)^2$
for $1 \le i, j \le n$, $a_i \in \mathbb{R}$, $a_i \ne a_j$ for $i \ne j$. They proved that such matrices have rank three and that

$\min\{k \mid n \le \binom{k}{\lfloor k/2 \rfloor}\} \le r_+, \quad \text{which means} \quad n \le \binom{r_+}{\lfloor r_+/2 \rfloor}, \qquad (3.2)$

where $r_+ = \mathrm{rank}_+(M)$. In fact, such matrices are biadjacency matrices of crown graphs (only
the diagonal entries are equal to zero).

2. Goemans [22] makes the following observation: the product UV of two nonnegative matrices
$U \in \mathbb{R}_+^{m \times k}$ and $V \in \mathbb{R}_+^{k \times n}$ generates a matrix M with at most $2^k$ columns (resp. $2^k$ rows)
with different sparsity patterns. In fact, the columns (resp. rows) of M are additive linear
combinations of the k columns of U (resp. rows of V) and therefore no more than $2^k$ sparsity
patterns can be generated from these columns (resp. rows). Therefore, letting $s_p$ be the maximum
between the number of columns and rows of $M \in \mathbb{R}_+^{m \times n}$ having a different sparsity pattern, we
have

$$\operatorname{rank}_+(M) \ge \log_2(s_p).$$

In particular, if all the columns and rows of M have a different sparsity pattern, then

$$\operatorname{rank}_+(M) \ge \log_2(\max(m, n)). \qquad (3.3)$$

Goemans then uses this result to show that any extended formulation of the permutahedron in
dimension n must have $\Omega(n \log(n))$ variables and constraints. In fact,

• The minimal size s is of the order of the nonnegative rank of its slack matrix plus n (cf.
Introduction).

• The slack matrix has n! columns (corresponding to each vertex of the polytope) with
different sparsity patterns (cf. Equation (1.1)).

This implies that

$$s = \Theta(\operatorname{rank}_+(S_M) + n) \ge \Theta(\log(n!)) = \Theta(n \log(n)).$$
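The two counting bounds recalled above are easy to evaluate numerically. The following Python sketch (the function names are ours and do not come from the paper) computes the crown-graph bound (3.2) and the sparsity-pattern bound (3.3), and applies both to the linear EDM $M(i,j) = (i-j)^2$. The logarithm in (3.3) is rounded up, since the nonnegative rank is an integer.

```python
from math import comb


def crown_bound(n):
    """Smallest k such that n <= C(k, floor(k/2)), cf. Equation (3.2)."""
    k = 1
    while comb(k, k // 2) < n:
        k += 1
    return k


def count_patterns(vectors):
    """Number of distinct zero/nonzero patterns among the given vectors."""
    return len({tuple(x != 0 for x in v) for v in vectors})


def sparsity_bound(M):
    """ceil(log2(s_p)), cf. Equation (3.3), where s_p is the larger of the
    numbers of distinct row and column sparsity patterns of M."""
    s_p = max(count_patterns(M), count_patterns(zip(*M)))
    return (s_p - 1).bit_length()  # equals ceil(log2(s_p)) for s_p >= 1


def linear_edm(a):
    """Linear Euclidean distance matrix M(i, j) = (a_i - a_j)^2."""
    a = list(a)
    return [[(ai - aj) ** 2 for aj in a] for ai in a]


if __name__ == "__main__":
    for n in (4, 6, 10, 20):
        M = linear_edm(range(1, n + 1))
        print(n, crown_bound(n), sparsity_bound(M))
```

Both bounds grow only logarithmically with n, which is the behaviour the remainder of this section tries to improve upon.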

In this section, we provide some theoretical results linking the restricted nonnegative rank with
the nonnegative rank, which allow us to improve and generalize the above results in Section 4 for both
slack and linear Euclidean distance matrices.



3.1 Geometric Interpretation of a Nonnegative Factorization as a Nested Polytopes Problem

In the following, we lay the groundwork for the main results of this paper, introducing essential 
notations and observations that will be extensively used in this section. We rely on the geometric 
interpretation of the nonnegative rank, see also |19| El [33] where similar results are presented. The 
main observation is that any rank-k nonnegative factorization (U, V) of a nonnegative matrix M can
be interpreted as the solution with k vertices of a nested polytopes problem in which the inner
polytope has dimension rank(M) - 1 and the outer polytope has dimension rank(U) - 1.

Without loss of generality, let $M \in \mathbb{R}_+^{m \times n}$, $U \in \mathbb{R}_+^{m \times k}$ and $V \in \mathbb{R}_+^{k \times n}$ be column stochastic with
M = UV (cf. proof of Theorem 1; the columns of M are convex combinations of the columns of U). If the
column space of U does not coincide with the column space of M, i.e., $r_u = \operatorname{rank}(U) > \operatorname{rank}(M) = r$,
it means that the columns of U belong to a higher dimensional affine subspace containing the columns
of M (otherwise, see Theorem 1).
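The normalization to column stochastic matrices can be made explicit. Below is a minimal Python sketch, assuming that neither U nor M = UV has a zero column (the helper names are ours): each column of U is divided by its sum and V is rescaled accordingly, so that the product equals M with normalized columns.

```python
from fractions import Fraction


def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def col_sums(A):
    """Sums of the columns of A."""
    return [sum(col) for col in zip(*A)]


def make_column_stochastic(U, V):
    """Given nonnegative U (m x k) and V (k x n), return (Mn, Un, Vn) with
    Mn = Un * Vn, all three column stochastic, where Mn is M = U * V with
    normalized columns. Assumes that U and M have no zero column."""
    M = matmul(U, V)
    su, sm = col_sums(U), col_sums(M)
    Un = [[Fraction(x) / su[l] for l, x in enumerate(row)] for row in U]
    Vn = [[Fraction(x) * su[l] / sm[j] for j, x in enumerate(row)]
          for l, row in enumerate(V)]
    Mn = [[Fraction(x) / sm[j] for j, x in enumerate(row)] for row in M]
    return Mn, Un, Vn


if __name__ == "__main__":
    U = [[1, 0], [1, 2], [0, 2]]
    V = [[1, 2, 0], [0, 1, 3]]
    Mn, Un, Vn = make_column_stochastic(U, V)
    print(matmul(Un, Vn) == Mn, col_sums(Un), col_sums(Vn), col_sums(Mn))
```

Exact rational arithmetic is used so that the column sums can be compared with one without rounding issues.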
Let us factorize U = AB where $A \in \mathbb{R}^{m \times r_u}$ and $B \in \mathbb{R}^{r_u \times k}$ are full rank and their columns sum to
one. As in Theorem 1, we can construct the polytope of the coefficients of the linear combinations of
the columns of A that generate stochastic vectors. It is defined as

$$P_u = \Big\{ x \in \mathbb{R}^{r_u - 1} \;\Big|\; f_u(x) = A(:, 1{:}r_u - 1)\, x + \Big(1 - \sum_{i=1}^{r_u - 1} x_i\Big) A(:, r_u) \ge 0 \Big\}.$$

Since $\operatorname{col}(M) \subseteq \operatorname{col}(U)$, there exists $B' \in \mathbb{R}^{r_u \times n}$ whose columns must sum to one such that
M = AB'. Since rank(M) = r and A is full rank, we must have rank(B') = r. By construction,
the columns of $B_u = B(1{:}r_u - 1, :)$ (corresponding to the columns of U) and $B_m = B'(1{:}r_u - 1, :)$
(corresponding to the columns of M) belong to $P_u$. Note that since rank(B') = r, the columns of $B_m$
live in a lower $(r - 1)$-dimensional polytope

$$P_m = \{ x \in \mathbb{R}^{r_u - 1} \mid f_u(x) \ge 0,\ f_u(x) \in \operatorname{col}(M) \} \subseteq P_u.$$

Polytope $P_m$ contains the points in $P_u$ generating vectors in the column space of M.
Moreover,

$$M = AB' = UV = ABV,$$

implying that (since A is full rank)

$$B' = BV \quad \text{and} \quad B_m = B_u V.$$

Finally, the columns of $B_m$ are contained in the convex hull of the columns of $B_u$, inside $P_u$, i.e.,

$$\operatorname{conv}(B_m) \subseteq \operatorname{conv}(B_u) \subseteq P_u.$$

Defining the polytope T as the convex hull of the columns of $B_u$, and the set of points S as the
columns of $B_m$, we can then interpret the nonnegative factorization (U, V) of M as follows. The
$(r_u - 1)$-dimensional polytope T with k vertices (corresponding to the columns of U) is nested between
an inner $(r - 1)$-dimensional polytope conv(S) (where each point in S corresponds to a column of M)
and an outer $(r_u - 1)$-dimensional polytope $P_u$.

Let us use the matrix M and its nonnegative factorization (U, V) of Example 2 as an illustration:
rank(M) = 3 so that $P_m$ is a two-dimensional polytope and contains the set of points S, while
rank(U) = 4 and defines a three-dimensional polytope T containing S, see Figure 2.

3.2 Upper Bound for the Restricted Nonnegative Rank 

From the geometric interpretation introduced in the previous paragraph, we can now give the main
result of this section. The idea is the following: using the notations of Section 3.1, we know that (1) the
polytope T (whose vertices correspond to the columns of U) contains the (lower dimensional) set of
points S (corresponding to the columns of M), and (2) S is contained in $P_m$ (which corresponds to
the set of stochastic vectors in the column space of M). Therefore, the intersection between T and
$P_m$ must also contain S, i.e., the intersection $T \cap P_m$ defines a polytope which (1) is contained in the
column space of M, and (2) contains S. Hence its vertices provide a feasible solution to the RNR
problem, and an upper bound for the restricted nonnegative rank can then be computed.

In other words, any nonnegative factorization (U, V) of a nonnegative matrix M can be used to 
construct a feasible solution to the restricted nonnegative rank problem. One has simply to compute 
the intersection of the polytope generated by the columns of U with the column space of M (which 
can obviously increase the number of vertices). 

Theorem 4. Using the notations of Section 3.1, we have

$$\operatorname{rank}_+^*(M) \le \#\,\text{vertices}(T \cap P_m). \qquad (3.4)$$

Figure 2: Illustration of the solution from Example 2 as a nested polytopes problem, with rank(M) =
3 < rank(U) = 4 < $\operatorname{rank}_+(M)$ = 5 < $\operatorname{rank}_+^*(M)$ = 6 = n. See Appendix A.2 for the code used to
perform the reduction.



Proof. Let $x_1, x_2, \ldots, x_v$ be the v vertices of $T \cap P_m$ and let $X = [x_1\ x_2\ \ldots\ x_v]$, which has rank at
most r (since it is contained in the $(r - 1)$-dimensional polyhedron $P_m$). By construction,

$$B_m(:, j) \in T \cap P_m = \operatorname{conv}(X), \quad 1 \le j \le n.$$

Therefore, there must exist a column stochastic matrix $V^* \in \mathbb{R}_+^{v \times n}$ such that

$$B_m = X V^*,$$

implying that

$$M = f_u(B_m) = f_u(X V^*) = f_u(X) V^* = U^* V^*,$$

where the third equality holds because $f_u$ is affine and the columns of $V^*$ sum to one. The matrix
$U^* = f_u(X) \in \mathbb{R}_+^{m \times v}$ is nonnegative since $x_i \in P_m \subseteq P_u$ for all i, and $U^*$ has rank r since
$M = U^* V^*$ implies that its rank is at least r and $U^* = f_u(X)$ that it is at most r. The pair $(U^*, V^*)$ is
then a feasible solution of the corresponding RNR problem for M and therefore
$\operatorname{rank}_+^*(M) \le v = \#\,\text{vertices}(T \cap P_m)$. □



3.3 Lower Bound for the Nonnegative Rank based on the Restricted Nonnegative Rank

We can now obtain a lower bound for the nonnegative rank based on the restricted nonnegative rank.
Indeed, if we consider an upper bound on the quantity $\#\,\text{vertices}(T \cap P_m)$ that increases with the
nonnegative rank (i.e., the number of vertices of T), we can reinterpret Theorem 4 as providing a
lower bound on the nonnegative rank. For that purpose, define the quantity faces(n, d, k) to be the
maximal number of k-faces of a polytope with n vertices in dimension d.

Theorem 5. The restricted nonnegative rank of a nonnegative matrix M with r = rank(M) and
$r_+ = \operatorname{rank}_+(M)$ can be bounded above by

$$\operatorname{rank}_+^*(M) \le \max_{r \le r_u \le r_+} \operatorname{faces}(r_+, r_u - 1, r_u - r). \qquad (3.5)$$
Proof. Let (U, V) be a rank-$r_+$ nonnegative factorization of M with rank(U) = $r_u$. Using the notations
of Section 3.1 and the result of Theorem 4, $\operatorname{rank}_+^*(M)$ is bounded above by the number of vertices
of $T \cap P_m$. Defining $Q_m = \{ x \in \mathbb{R}^{r_u - 1} \mid f_u(x) \in \operatorname{col}(M) \}$, we have $P_m = Q_m \cap P_u$ and, since $T \subseteq P_u$,

$$P_m \cap T = Q_m \cap P_u \cap T = Q_m \cap T.$$

Since $Q_m$ is $(r - 1)$-dimensional, the number of vertices of $T \cap Q_m$ is bounded above by the number
of $(r_u - r)$-faces of T (in an $(r_u - 1)$-dimensional space, $(r_u - r)$-faces are defined by $r - 1$ equalities);
we then have

$$\operatorname{rank}_+^*(M) \le \#\,\text{vertices}(T \cap P_m) = \#\,\text{vertices}(T \cap Q_m) \le \operatorname{faces}(r_+, r_u - 1, r_u - r).$$

Notice that for $r_u = r$, $\operatorname{faces}(r_+, r - 1, 0) = r_+$, which gives $r_+ = \operatorname{rank}_+^*(M)$ as expected. Finally,
taking the maximum over all possible values of $r \le r_u \le r_+$ gives the above bound (3.5). □

For easier reference, we introduce a function $\phi$ corresponding to the upper bound in Theorem 5,
i.e.,

$$\phi(r, r_+) = \max_{r \le r_u \le r_+} \operatorname{faces}(r_+, r_u - 1, r_u - r).$$

Clearly, when r is fixed, $\phi$ is an increasing function of its second argument $r_+$, since faces(n, d, k)
increases with n. Therefore the inequality $\operatorname{rank}_+^*(M) \le \phi(r, r_+)$ from Theorem 5 implicitly provides a
lower bound on the nonnegative rank $r_+$ that depends on both the rank r and the restricted nonnegative rank
$\operatorname{rank}_+^*(M)$.

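Explicit values for the function $\phi$ can be computed using a tight bound for faces(n, d, k) attained by
cyclic polytopes [37, p. 257, Corollary 8.28]

$$\operatorname{faces}(n, d, k - 1) = {\sum_{i=0}^{d/2}}^{*} \left( \binom{d - i}{k - i} + \binom{i}{k - d + i} \right) \binom{n - d - 1 + i}{i},$$

where $\sum^{*}$ denotes a sum where only half of the last term is taken for $i = d/2$ if d is even, and the whole
last term is taken for $i = \lfloor d/2 \rfloor = (d - 1)/2$ if d is odd. Alternatively, simpler versions of the bound can be
worked out in the following way: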

Theorem 6. The upper bound $\phi(r, r_+)$ on the restricted nonnegative rank of a nonnegative matrix M
with r = rank(M) and $r_+ = \operatorname{rank}_+(M)$ satisfies

$$\phi(r, r_+) = \max_{r \le r_u \le r_+} \operatorname{faces}(r_+, r_u - 1, r_u - r)
\le \max_{r \le r_u \le r_+} \binom{r_+}{r_u - r + 1} \le \binom{r_+}{\lfloor r_+/2 \rfloor} \le 2^{r_+} \sqrt{\frac{2}{\pi r_+}} \le 2^{r_+}.$$

Proof. The first inequality follows from the fact that $\operatorname{faces}(n, d, k - 1) \le \binom{n}{k}$, since any set of k distinct
vertices defines at most one $(k - 1)$-face. The second follows from the maximality of central binomial
coefficients. The third is a standard upper bound on central binomial coefficients, and the fourth is
an even cruder upper bound. □

We will see in Section 4 that some of these weaker bounds correspond to existing results from the 
literature. 

When the matrix M is symmetric, the bound can be slightly strengthened, leading to a different
function $\phi'$:
Corollary 2. Given a symmetric matrix M with $r_+ = \operatorname{rank}_+(M)$, r = rank(M) and $r_+ \ge r + 1$, we
have

$$\operatorname{rank}_+^*(M) \le \max_{r \le r_u \le r_+ - 1} \operatorname{faces}(r_+, r_u - 1, r_u - r) = \phi'(r, r_+) \le \phi(r, r_+).$$

Proof. We have seen in Lemma 3 that for symmetric matrices $r_u = r_+$ implies $\operatorname{rank}_+^*(M) = r_+$.
Therefore, in case $r_+ \ge r + 1$, one can strengthen the result of Theorem 5 and only consider the range
$r \le r_u \le r_+ - 1$. □

3.3.1 Improvements in the rank-three case 

It is possible to improve the above bound by finding better upper bounds for $\#\,\text{vertices}(T \cap P_m)$ in
Equation (3.4). For example, since two-dimensional polytopes (i.e., polygons) have the same number
of vertices (0-faces) and edges (1-faces), we have for rank(M) = 3 that

$$\#\,\text{vertices}(T \cap P_m) = \#\,\text{edges}(T \cap P_m).$$

Using the same argument as in Theorem 5, the number of edges of $T \cap P_m$ is bounded above by the
number of $(r_u - r + 1)$-faces of T (defined by $r - 2$ equalities), leading to

Corollary 3. The restricted nonnegative rank of a rank-three nonnegative matrix M with
$r_+ = \operatorname{rank}_+(M)$ can be bounded above with

$$\operatorname{rank}_+^*(M) \le \max_{3 \le r_u \le r_+} \ \min_{i = 0, 1} \operatorname{faces}(r_+, r_u - 1, r_u - 3 + i) \le \phi(3, r_+). \qquad (3.6)$$

The minimum taken between i = 0 and i = 1 simply accounts for the two possible cases, i.e., the bound
based on $\#\,\text{vertices}(T \cap P_m)$ with i = 0 as in Theorem 5, or based on $\#\,\text{edges}(T \cap P_m)$ with i = 1. A
similar bound holds in the symmetric case.
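The bounds of this section can be evaluated with a few lines of code. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: `faces` uses the face numbers of cyclic polytopes (the count from [37, Corollary 8.28]), `phi` and `phi_rank3` implement the right-hand sides of (3.5) and (3.6), and the `symmetric` flag restricts the range of $r_u$ as in Corollary 2. All function names are ours.

```python
from math import comb


def binom(n, k):
    """Binomial coefficient, equal to zero outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0


def faces(n, d, k):
    """Number of k-faces of a cyclic d-polytope with n vertices (n > d >= 1),
    i.e., the maximum number of k-faces of a d-polytope with n vertices."""
    j = k + 1  # a k-face of a simplicial polytope has k + 1 vertices
    total = 0
    for i in range(d // 2 + 1):
        term = (binom(d - i, j - i) + binom(i, j - d + i)) * binom(n - d - 1 + i, i)
        if d % 2 == 0 and i == d // 2:
            term //= 2  # only half of the last term is taken when d is even
        total += term
    return total


def phi(r, r_plus, symmetric=False):
    """Right-hand side of (3.5); with symmetric=True the range of r_u stops
    at r_plus - 1 as in Corollary 2 (which requires r_plus >= r + 1)."""
    top = r_plus - 1 if symmetric else r_plus
    return max(faces(r_plus, ru - 1, ru - r) for ru in range(r, top + 1))


def phi_rank3(r_plus, symmetric=False):
    """Right-hand side of (3.6) for rank-three matrices."""
    top = r_plus - 1 if symmetric else r_plus
    return max(min(faces(r_plus, ru - 1, ru - 3 + i) for i in (0, 1))
               for ru in range(3, top + 1))


def lower_bound(n, bound):
    """Smallest r_plus >= 4 with n <= bound(r_plus): implied lower bound on
    the nonnegative rank of a symmetric rank-three matrix whose restricted
    nonnegative rank equals n >= 4."""
    r_plus = 4
    while bound(r_plus) < n:
        r_plus += 1
    return r_plus


if __name__ == "__main__":
    dims = range(4, 11)
    print([lower_bound(n, lambda rp: phi_rank3(rp, True)) for n in dims])
    print([lower_bound(n, lambda rp: phi(3, rp, True)) for n in dims])
```

Inverting the bounds in this way, for symmetric rank-three matrices with maximal restricted nonnegative rank, gives the kind of values reported for linear EDM's in Table 1.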

4 Applications 

So far, we have not provided explicit lower bounds for the nonnegative rank. As we have seen, 
inequalities (3.5) and (3.6) can be interpreted as implicit lower bounds on the nonnegative rank $r_+$,
but have the drawback of depending on the restricted nonnegative rank, which cannot be computed 
efficiently unless the rank of the matrix is at most 3 (Theorems 2 and 3).

Nevertheless, we provide in this section explicit lower bounds for the nonnegative rank of slack
matrices (Section 4.1) and linear Euclidean distance matrices (Section 4.2), cf. the introduction of
Section 3. These bounds are derived by showing that the restricted nonnegative rank of such matrices is
maximum, i.e., it is equal to the number of columns of these matrices (cf. Lemma 1).

4.1 Slack Matrices 

Let us start with a simple observation: it is easy to construct an m × n matrix of rank r < min(m, n) with
maximum restricted nonnegative rank n:

1. Take any (r — l)-dimensional polytope P with n vertices. 

2. Construct a NPP instance with S = vertices (P). 

3. Compute the corresponding matrix M in the equivalent RNR instance. 

Clearly, the unique solution for NPP is T = P = conv(S) and therefore the matrix M in the corresponding
RNR instance must satisfy $\operatorname{rank}_+^*(M) = \#\,\text{vertices}(T) = n$; see Example 1 for an illustration
with the three-dimensional cube.
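The three steps above can be sketched in Python for the unit cube. The sketch assumes, as in the proof of Theorem 8, that the columns of M in the RNR instance are the vectors $C x_j + d$ for the points $x_j$ of S, with the outer polytope written as $\{x \mid Cx + d \ge 0\}$; any column normalization is omitted and the helper names are ours.

```python
from fractions import Fraction
from itertools import product


def slack_matrix(C, d, points):
    """M(i, j) = C(i, :) x_j + d(i) for the polytope {x | Cx + d >= 0}."""
    return [[sum(c * x for c, x in zip(row, p)) + di for p in points]
            for row, di in zip(C, d)]


def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][col] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][col] / A[r][col]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r


# Step 1: the unit cube in R^3, written as x_i >= 0 and 1 - x_i >= 0.
C = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [-1, 0, 0], [0, -1, 0], [0, 0, -1]]
d = [0, 0, 0, 1, 1, 1]
# Step 2: the points S are the 8 vertices of the cube.
S = list(product((0, 1), repeat=3))
# Step 3: the corresponding 6 x 8 nonnegative matrix M.
M = slack_matrix(C, d, S)

if __name__ == "__main__":
    for row in M:
        print(row)
    print("rank:", rank(M))
```

The resulting matrix has rank 4, one more than the dimension of the cube, while by the observation above its restricted nonnegative rank equals its number of columns.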
Remark 1. The matrices constructed as described above also satisfy

$$\operatorname{rank}(M) < \operatorname{rank}_+(M).$$

Otherwise $\operatorname{rank}_+^*(M) = \operatorname{rank}_+(M) = \operatorname{rank}(M) < \min(m, n)$, which is a contradiction. This is
interesting because it is nontrivial to construct matrices with $\operatorname{rank}(M) < \operatorname{rank}_+(M)$ [28]. In fact, it is easy
to check that generating randomly two nonnegative matrices U and V of dimensions m × r and r × n
respectively, and constructing M = UV, will generate a matrix M of rank r with probability one.

In the context of compact formulations (cf. Section 1), the aim is to express a polytope Q with
fewer constraints by using some additional variables, i.e., find a lifting of polynomial size. A possible
way to do that is to compute a nonnegative factorization of the slack matrix $S_M$ of Q [36] (see
Equation (1.1)). The next theorem states that the restricted nonnegative rank of any slack matrix
$S_M \in \mathbb{R}_+^{f \times v}$ is maximum (f is the number of facets of Q, v its number of vertices), i.e., $\operatorname{rank}_+^*(S_M) = v$.
This is directly related to the above observation: the slack matrix of a polytope Q corresponds to an
NPP instance where Q is the outer polytope and its vertices are the points defining the inner polytope.
Notice that the restricted nonnegative rank used as an upper bound for the nonnegative rank is useless
in this case.

Theorem 7. Let $Q = \{x \in \mathbb{R}^q \mid Fx \ge h,\ Ex = g\}$ be a p-dimensional polytope with v vertices, v > 1,
and let $S_M(Q)$ be its slack matrix, then $\operatorname{rank}_+^*(S_M(Q)) = v$.

Proof. In order to prove this result, we first construct a bijective transformation L between Q and a
full-dimensional polytope $P \subseteq \mathbb{R}^p$. The vertices of P can then be easily constructed from the vertices
of Q, which allows us to show that P and Q share the same slack matrix. Finally, using the result of
Theorem 1, we show that the slack matrix of P has maximum restricted nonnegative rank.

Since Q is a p-dimensional polytope, there exists a polytope $P \subseteq \mathbb{R}^p$ and a bijective affine
transformation

$$L : Q \to P : x \mapsto L(x) = Ax + b \quad \text{and} \quad L^{-1} : P \to Q : y \mapsto L^{-1}(y) = A^\dagger y - A^\dagger b,$$

such that P = L(Q) and $Q = L^{-1}(P)$ (where $A \in \mathbb{R}^{p \times q}$ has full rank, $A^\dagger \in \mathbb{R}^{q \times p}$ is its right inverse
and $b \in \mathbb{R}^p$).
By construction,

$$\begin{aligned}
P &= \{ y \in \mathbb{R}^p \mid y = L(x),\ x \in Q \} = \{ y \in \mathbb{R}^p \mid L^{-1}(y) \in Q \} \\
  &= \{ y \in \mathbb{R}^p \mid F L^{-1}(y) \ge h,\ E L^{-1}(y) = g \} \\
  &= \{ y \in \mathbb{R}^p \mid F A^\dagger y \ge h + F A^\dagger b \},
\end{aligned}$$

because the equalities $E L^{-1}(y) = g$ must be satisfied for all $y \in \mathbb{R}^p$, P being full-dimensional.

Noting $C = F A^\dagger$ and $d = h + F A^\dagger b$, we have $P = \{ y \in \mathbb{R}^p \mid Cy \ge d \}$. Finally, we observe that

1. Denoting by $v_i$ the v vertices of Q, the points $L(v_i)$ are the v vertices of P. This can easily
be checked since L is bijective ($\forall y \in P$, $\exists!\, x \in Q$ s.t. y = L(x) and vice versa).

2. P can be taken as the outer polytope of an NPP instance, i.e., P is bounded and (C d) is full
rank. P is bounded since Q is. C is full rank because P has at least one vertex (v > 1). If (C d)
was not full rank, then $\exists z \in \mathbb{R}^p$ such that d = Cz, implying that $z \in P$. Since P has at least
two vertices (v > 1), $\exists y \in P$ with $y \neq z$, and one can check that $y + \alpha(y - z) \in P$ $\forall \alpha \ge 0$. This
is a contradiction because P is bounded.
3. The slack matrix of P is equal to the slack matrix of Q:

$$\begin{aligned}
S_M(P) &= C L(V) - [d \ \ldots\ d] = F A^\dagger L(V) - [h + F A^\dagger b \ \ldots\ h + F A^\dagger b] \\
&= F \big( A^\dagger L(V) - [A^\dagger b \ \ldots\ A^\dagger b] \big) - [h \ \ldots\ h] \\
&= F L^{-1}(L(V)) - [h \ \ldots\ h] = F V - [h \ \ldots\ h] \\
&= S_M(Q),
\end{aligned}$$

where $V = [v_1\ v_2\ \ldots\ v_v]$ is the matrix whose columns are the vertices of Q, and
$L(V) = [L(v_1)\ L(v_2)\ \ldots\ L(v_v)]$ is the matrix whose columns are the vertices of P.

4. The NPP instance with P as the outer polytope and its v vertices $L(v_i)$ as the set of points S
defining the inner polytope has a unique and optimal solution T = P = conv(S) with v vertices.
The matrix M in the RNR instance corresponding to this NPP instance is given by the slack
matrix $S_M(P)$ of P, implying that its restricted nonnegative rank is equal to v (cf. Theorem 1).

We can conclude that $\operatorname{rank}_+^*(S_M(Q)) = v$. □

We can now derive a lower bound on the nonnegative rank of a slack matrix and on the size of an
extended formulation, by combining Theorem 5 (cf. Equation (3.5)), Theorem 6, Theorem 7 and the
result of Yannakakis [36] (see also Section 1).

Corollary 4. Let P be a polytope with v vertices and let $S_M \in \mathbb{R}_+^{f \times v}$ be its slack matrix of rank r
(i.e., P has dimension r - 1), then

$$v \le \phi(r, r_+) = \phi_r(r_+) \le \max_{r \le r_u \le r_+} \binom{r_+}{r_u - r + 1} \le \binom{r_+}{\lfloor r_+/2 \rfloor} \le 2^{r_+}, \qquad (4.1)$$

where $r_+ = \operatorname{rank}_+(S_M)$. Therefore, the minimum size s of any extended formulation of P satisfies

$$s = \Theta(r_+ + n) \ge \Theta(\phi_r^{-1}(v)) \ge \Theta(\log_2(v)),$$

where $\phi_r^{-1}(\cdot)$ is the inverse of the nondecreasing function $\phi_r(\cdot) = \phi(r, \cdot)$.

The last bound $2^{r_+}$ from Equation (4.1) is the one of Goemans [22, Theorem 1] (see the introduction
of Section 3), and therefore Corollary 4 provides us with an improved lower bound, even though it is
still in $O(\log_2(v))$. It is actually not possible to provide an unconditionally better bound (i.e., without
making additional hypotheses on the polytope P): since Goemans showed that the size of any LP
formulation of the permutahedron (with v = n! vertices) must be in $O(n \log(n))$, this implies that the
nonnegative rank of its slack matrix is in $O(n \log(n))$.
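For the permutahedron, the counting bound is simple to evaluate. The Python sketch below (function names are ours) computes $\lceil \log_2(n!) \rceil$ exactly with integer arithmetic. This is the lower bound on the nonnegative rank of the slack matrix implied by the last inequality of (4.1) with v = n!.

```python
from math import factorial


def log2_ceil(x):
    """Smallest integer k with 2**k >= x, for a positive integer x."""
    return (x - 1).bit_length()


def permutahedron_bound(n):
    """ceil(log2(n!)): lower bound on the nonnegative rank of the slack
    matrix of the permutahedron in dimension n, from v <= 2**r_plus."""
    return log2_ceil(factorial(n))


if __name__ == "__main__":
    for n in (3, 4, 10, 20):
        print(n, permutahedron_bound(n))
```

Since $\log_2(n!) = \Theta(n \log n)$, this matches the order of the known compact formulation of the permutahedron, which is why no unconditionally better bound can be expected.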



4.2 Linear Euclidean Distance Matrices 

Linear Euclidean distance matrices (linear EDM's) are defined by 

$$M(i,j) = (a_i - a_j)^2, \quad 1 \le i, j \le n, \quad \text{for some } a \in \mathbb{R}^n. \qquad (4.2)$$

In this section we assume $a_i \neq a_j$ for $i \neq j$, so that these matrices have rank three. Linear EDM's were
used in [2] to show that the nonnegative rank of a matrix with fixed rank (rank 3 in this case) can be
made as large as desired (while increasing the size of the matrix), implying that an upper bound for
the nonnegative rank of a matrix based only on the rank cannot exist.

We refer the reader to [25] and the references therein for detailed discussions about Euclidean
distance matrices, and related applications. 
4.2.1 Restricted Nonnegative Rank of Linear Euclidean Distance Matrices

We first show that the restricted nonnegative rank of linear EDM's is maximum, i.e., it is equal to 
their dimension n. 

Definition 2. The columns of a matrix M have disjoint sparsity patterns if and only if

$$s_i \not\subseteq s_j, \quad \forall i \neq j,$$

where $s_i = \{ k \mid M(k, i) = 0 \}$ is the sparsity pattern of the i-th column of M.
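Definition 2 can be checked mechanically. A minimal Python sketch (helper names are ours) tests whether no column's zero set is contained in that of another column, and applies the test to a linear EDM built from distinct values $a_i$.

```python
def zero_set(M, j):
    """Sparsity pattern s_j = {k | M(k, j) = 0} of column j (0-based)."""
    return {k for k, row in enumerate(M) if row[j] == 0}


def has_disjoint_sparsity_patterns(M):
    """True if no column's zero set is contained in another column's."""
    s = [zero_set(M, j) for j in range(len(M[0]))]
    return all(not (s[i] <= s[j])
               for i in range(len(s)) for j in range(len(s)) if i != j)


def linear_edm(a):
    """Linear Euclidean distance matrix M(i, j) = (a_i - a_j)^2."""
    return [[(ai - aj) ** 2 for aj in a] for ai in a]


if __name__ == "__main__":
    print(has_disjoint_sparsity_patterns(linear_edm([0, 1, 3, 7, 8])))
    print(has_disjoint_sparsity_patterns([[0, 1], [1, 1]]))
```

For a linear EDM with distinct $a_i$, column i vanishes only in row i, so the zero sets are distinct singletons and the property holds. A column without any zero has an empty zero set, which is contained in every other one, so the property fails as soon as such a column is present.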

Theorem 8. Let M be a rank-three nonnegative square matrix of dimension n whose columns have
disjoint sparsity patterns, then

$$\operatorname{rank}_+^*(M) = n.$$

In particular, linear EDM's have this property.

Proof. Let P, S and T be the polygons defined in the two-dimensional NPP instance corresponding
to the RNR instance of M (cf. Theorem 1). Aggarwal et al. [1] observe that if two points in S are on
different edges of P, they define a polygon with the boundary of P (see the dark regions in Figure 3)
which must contain a point of the solution T. Otherwise these two points could not be contained in
T (see also Section 2.2.1). Therefore if each point of S is on a different edge of the boundary of P,
any solution T to NPP must have at least |S| = n vertices since S defines n disjoint polygons with
the boundary of P. Finally, two points $x_1$ and $x_2$ in S are on different edges of the boundary of the
polytope $P = \{ x \in \mathbb{R}^2 \mid Cx + d \ge 0 \}$ if and only if $(Cx_1 + d)$ and $(Cx_2 + d)$ have disjoint sparsity
patterns or, equivalently, if and only if the two corresponding columns of M (which are precisely equal
to $Cx_1 + d$ and $Cx_2 + d$) in the RNR instance have disjoint sparsity patterns. Indeed, for two vertices
a and b to be located on different edges, one needs at least (1) one inequality that is active at a and
inactive at b and (2) another inequality that is active at b and inactive at a. This is equivalent to
requiring the sparsity patterns of the corresponding columns of the slack matrix to be disjoint. □

Figure 3: Illustration of the restricted nonnegative rank of a linear EDM of dimension 5. The solution
T must contain a point in each dark region, that is $\operatorname{rank}_+^*(M) = |T| = |S| = 5$.

Remark 2. This result does not hold for higher rank matrices. For example, the matrix 
has rank(M) = 4 and $\operatorname{rank}_+^*(M) \le 5$ since rank(U) = 4. Therefore we cannot conclude that higher
dimensional Euclidean distance matrices have maximal restricted nonnegative rank.

4.2.2 Nonnegative Rank of Linear Euclidean Distance Matrices 

Since linear EDM's are rank-three symmetric matrices, one can combine the results of Theorem 8 with
Corollary 3 (cf. Equation (3.6)) and Corollary 2 in order to obtain lower bounds for the nonnegative
rank of linear EDM's.

Corollary 5. For any linear Euclidean distance matrix M, we have

$$\begin{aligned}
\operatorname{rank}_+^*(M) = n &\le \max_{3 \le r_u \le r_+ - 1} \ \min_{i = 0, 1} \operatorname{faces}(r_+, r_u - 1, r_u - r + i) \\
&\le \max_{3 \le r_u \le r_+ - 1} \operatorname{faces}(r_+, r_u - 1, r_u - r) = \phi'(r, r_+) \\
&\le \binom{r_+}{\lfloor r_+/2 \rfloor} \\
&\le 2^{r_+}.
\end{aligned}$$

We observe that our results (first two inequalities above, from Theorem 5 and Corollary 3)
strengthen the bounds from Equations (3.2) (Beasley and Laffey [2]) and (3.3) (Goemans [22]).
Figure 4 displays the growth of the different bounds, and Table 1 compares the lower bounds on the
nonnegative rank for small values of n. For example, for a linear EDM to be guaranteed to have
nonnegative rank 10, the bounds require respectively n = 50 (3.6), n = 150 (3.5), n = 252 (3.2) and
n = 1024 (3.3). This is a significant improvement, even though all the bounds are still of the same
order with $r_+ \in O(\log(n))$.

Figure 4: Comparison of the different bounds for symmetric n-by-n matrices, with $\operatorname{rank}_+^*(M) = n$.

dimension n                 4   5   6   7   8   9   10

Equation (3.6)              4   5   5   6   6   6   7
Equation (3.5)              4   5   5   5   5   5   6
Beasley and Laffey (3.2)    4   4   4   5   5   5   5
Goemans (3.3)               3   3   3   3   4   4   4

Table 1: Comparison of the lower bounds for the nonnegative rank of linear EDM's.

Is it possible to further improve these bounds? Beasley and Laffey [2] conjectured that the
nonnegative rank of linear EDM's is maximum, i.e., it is equal to their dimension. Lin and Chu [28,
Theorem 3.1] claim to have proved that this equality always holds, which cannot be correct because
of the following example⁹.



Example 3. Taking $M \in \mathbb{R}_+^{6 \times 6}$ with

$$M(i,j) = (i - j)^2, \quad 1 \le i, j \le 6,$$

gives $\operatorname{rank}_+(M) = 5$. In fact,

(4.3)

(4.4)

so that $\operatorname{rank}_+(M) \le 5$, and $\operatorname{rank}_+(M) \ge 5$ is guaranteed by Equation (3.6), see Table 1 with
$\operatorname{rank}_+^*(M) = n = 6$ (or by Lemma\^ see Example^).

Example 3 proves that linear EDM's do not necessarily have a nonnegative rank equal to their
dimension. In fact, we can even show that

Theorem 9. Linear EDM's of the following form

$$M_n(i,j) = (i - j)^2, \quad 1 \le i, j \le n,$$

satisfy

$$\operatorname{rank}_+(M_n) \le 2 + \left\lceil \frac{n}{2} \right\rceil,$$

where $\lceil x \rceil$ is the smallest integer greater than or equal to x.

9 In their proof, they actually show that the restricted nonnegative rank is maximum (not the nonnegative rank), see
Theorem 8. In fact, they only consider the case when the vertices of the solution T (corresponding to the columns of U)
belong to the low-dimensional affine subspace defined by S (corresponding to the columns of M) in the NPP instance.
Proof. Let us first assume that n is even and define

where $I_m$ is the identity matrix of dimension m and $P_m$ is the permutation matrix with
$P_m(i,j) = I_m(i, m - j + 1)$ $\forall i, j$; see Equation (4.4) for an example when n = 6. One can check that

$$M_n = UV = \begin{pmatrix} M_{n/2} & A + P_{n/2} M_{n/2} \\ A^T + P_{n/2} M_{n/2} & M_{n/2} \end{pmatrix}, \quad \text{with } A = \begin{pmatrix} n - 1 \\ n - 3 \\ \vdots \\ 1 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \\ \vdots \\ n - 3 \\ n - 1 \end{pmatrix}^T.$$

If n is odd, we simply observe that $\operatorname{rank}_+(M_n) \le \operatorname{rank}_+(M_{n+1}) \le 2 + \frac{n+1}{2} = 2 + \lceil \frac{n}{2} \rceil$ since $M_n$ is a
submatrix of $M_{n+1}$.
□
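The block identity in the proof can be checked numerically. The Python sketch below builds one pair of nonnegative factors with inner dimension n/2 + 2 that reproduces $M_n$ for even n. This particular choice, $U = \begin{pmatrix} I_m & a & 0 \\ P_m & 0 & b \end{pmatrix}$ and $V = \begin{pmatrix} M_m & P_m M_m \\ 0 & b^T \\ a^T & 0 \end{pmatrix}$ with m = n/2, $a = (n-1, n-3, \ldots, 1)^T$ and $b = (1, 3, \ldots, n-1)^T$, is our assumption. It is consistent with the block form above, but it may differ from the authors' exact definition of U and V.

```python
def linear_edm(n):
    """M_n(i, j) = (i - j)^2 for 1 <= i, j <= n."""
    return [[(i - j) ** 2 for j in range(1, n + 1)] for i in range(1, n + 1)]


def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def candidate_factors(n):
    """Hypothetical nonnegative factors U (n x (m + 2)) and V ((m + 2) x n)
    of M_n for even n, with m = n / 2 (an assumption, see the lead-in)."""
    if n % 2:
        raise ValueError("n must be even")
    m = n // 2
    Mm = linear_edm(m)
    PMm = Mm[::-1]                                  # P_m M_m: rows of M_m reversed
    a = [n + 1 - 2 * i for i in range(1, m + 1)]    # (n-1, n-3, ..., 1)
    b = [2 * j - 1 for j in range(1, m + 1)]        # (1, 3, ..., n-1)
    eye = [[int(i == j) for j in range(m)] for i in range(m)]
    U = [eye[i] + [a[i], 0] for i in range(m)]             # block row (I_m, a, 0)
    U += [eye[m - 1 - i] + [0, b[i]] for i in range(m)]    # block row (P_m, 0, b)
    V = [Mm[i] + PMm[i] for i in range(m)]                 # block row (M_m, P_m M_m)
    V += [[0] * m + b, a + [0] * m]                        # rows (0, b^T) and (a^T, 0)
    return U, V


if __name__ == "__main__":
    for n in (4, 6, 8, 10):
        U, V = candidate_factors(n)
        print(n, len(V), matmul(U, V) == linear_edm(n))
```

The check relies on the identity $(m + j - i)^2 - (m + 1 - i - j)^2 = (n + 1 - 2i)(2j - 1)$, which is what makes the off-diagonal blocks equal to a rank-one matrix plus $P_m M_m$.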



Remark 3. In the construction of Theorem 9, one can check that rank(V) = 4 and the factorization
can then be interpreted as a nested polytopes problem (corresponding to $M^T = V^T U^T$) in which the
outer polytope has (only) dimension 3. Therefore, there is still some room for improvement and
$\operatorname{rank}_+(M_n)$ is probably (much?) smaller.

This example also demonstrates that, in some cases, the structure of small size nonnegative
factorizations (in this case, the one from Example 3) can be generalized to larger size nonnegative
factorization problems. This might open new ways to compute large nonnegative factorizations.

In Example 3, the nonnegative rank is smaller than the restricted nonnegative rank because there
exists a higher dimensional polytope with only 5 vertices whose convex hull encloses the 6 vertices
defined by the columns of M. The nested polytopes instances corresponding to the RNR instance with M
given by Example 3 and the two above solutions are illustrated in Figures 2 and 5 respectively (note
that they are transposed to each other, but correspond to different solutions of the NPP instance),
see Section 3.1. Notice that the second solution (Figure 5) completely includes the outer polytope P;
therefore, the nonnegative rank of any nonnegative matrix with the same column space as the matrix
M will be at most 5.

The solutions of the above nonnegative rank problem have been computed with standard nonneg- 
ative matrix factorization algorithms |26} [TU] and, in general, the optimal solution is found after 10 to 
100 restarts of these algorithms^]. 

10 These algorithms are based on standard nonlinear optimization schemes (rescaled gradient descent and
block-coordinate descent), and require initial matrices (U, V), which were randomly generated.

Figure 5: Illustration of the solution from Equation (4.4) as a nested polytopes problem, based on a
linear EDM with rank(M) = 3 < rank(U) = 4 < $\operatorname{rank}_+(M)$ = 5 < $\operatorname{rank}_+^*(M)$ = 6 = n.

We also observed that when the vectors a in Equation (4.2) used to construct the linear EDM's
are chosen randomly, the nonnegative rank seems to be maximal (i.e., equal to the dimension of the
matrix). In fact, even with 1000 restarts of the NMF algorithms with several random linear EDM's
(of dimensions up to n = 12, and using a factorization rank of n - 1), every stationary point we could
obtain had an error ($= \sum_{i,j} (M - UV)_{ij}^2$) bounded away from zero. The following related question is
still open:

Question 1. Does there exist a nonnegative (symmetric?) n × n square matrix M such that rank(M) =
3 and $\operatorname{rank}_+(M) = n$, for each $n \ge 6$?

Table 1 implies that linear EDM's with $n \le 5$ satisfy this property¹¹.

We adapt the conjecture of Beasley and Laffey [2] as follows: 

Conjecture 1. Random linear EDM's of dimension n are such that rank(M) = 3 and rank + (M) = n 
with probability one. 

4.3 The Nonnegative Rank of a Product 

Beasley and Laffey [2] proved that for A = BC with A, B and C $\ge 0$,

$$\operatorname{rank}_+(A) \le \operatorname{rank}(B) \operatorname{rank}(C).$$

In particular, $\operatorname{rank}_+(A^2) \le \operatorname{rank}(A)^2$. They also conjectured that for a nonnegative n × n matrix A,

$$\operatorname{rank}_+(A^2) \le \operatorname{rank}(A),$$

which we prove to be false with the following counterexample (based on a circulant matrix)

where $a = 1 + \sqrt{2}$. In fact, one can check that rank(A) = 3 and $\operatorname{rank}_+(A^2) = 4$: indeed, $\operatorname{rank}_+^*(A^2) = 4$
can be computed with the algorithm of Aggarwal et al. [1] (see Figure 6 for an illustration) and, by
Corollary 1, $\operatorname{rank}_+(A^2) = \operatorname{rank}_+^*(A^2)$ since $\operatorname{rank}_+^*(A^2) \le \operatorname{rank}(A^2) + 1 = 4$.

11 Recall we assumed $a_i \neq a_j$ $\forall i \neq j$ so that such linear EDM's have rank three [2].

Figure 6: Illustration of an NPP instance corresponding to $A^2$ and an optimal solution T, cf.
Equation (4.5). See Appendix A.1 for the code used to perform the reduction.



Remark 4. The matrix A from Equation (4.5) is the slack matrix of a regular octagon with sides of
length $\sqrt{2}$. By Theorem 7, we have $\operatorname{rank}_+^*(A) = 8$. Notice also that A has rank 3 and its columns
have disjoint sparsity patterns so that $\operatorname{rank}_+^*(A) = 8$ is implied by Theorem 8 as well. What is the
nonnegative rank of A? Defining

we have that B = AR is symmetric, has rank 3 and only has zeros on its diagonal. By Theorem 8,
$\operatorname{rank}_+^*(B) = 8$. Using Table 1, we have $\operatorname{rank}_+(B) \ge 6$. Moreover

$$\operatorname{rank}_+(AR) \le \min(\operatorname{rank}_+(A), \operatorname{rank}_+(R)),$$

implying that $6 \le \operatorname{rank}_+(B) \le \operatorname{rank}_+(A)$. Finally, $\operatorname{rank}_+(A) = 6$ because

(4.6)

with rank(U) = 4 and rank(V) = 5. Figure 7 displays the corresponding nested polytopes problem, see
Section 3.1 and Appendix A.2.

Figure 7: Illustration of a nested polytopes instance corresponding to A and an optimal solution, cf.
Equations (4.5) and (4.6).

It is interesting to observe that, from this nonnegative factorization, one can obtain an extended
formulation (lifting) Q of the regular octagon $P = \{x \in \mathbb{R}^2 \mid Cx \le d\}$, defined as
$Q = \{(x,y) \in \mathbb{R}^2 \times \mathbb{R}^6 \mid Cx + Uy = d,\ y \ge 0\}$, with

$$C = \begin{pmatrix} 1 & \frac{\sqrt{2}}{2} & 0 & -\frac{\sqrt{2}}{2} & -1 & -\frac{\sqrt{2}}{2} & 0 & \frac{\sqrt{2}}{2} \\ 0 & \frac{\sqrt{2}}{2} & 1 & \frac{\sqrt{2}}{2} & 0 & -\frac{\sqrt{2}}{2} & -1 & -\frac{\sqrt{2}}{2} \end{pmatrix}^T,$$

and $d(i) = 1 + \frac{\sqrt{2}}{2}$ $\forall i$, see Equation (1.2). Since the system of equalities Cx + Uy = d only defines
4 linearly independent equalities (rank([C U]) = 4), the description of Q can then be simplified and
expressed with 4 variables and 6 inequality constraints.

This extended formulation is actually a particular case of a construction proposed by Ben-Tal and
Nemirovski [3] to find an extended formulation of size O(k) for the regular $2^k$-gon in two dimensions.
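The description $\{x \mid Cx \le d\}$ of the octagon can be checked numerically. The Python sketch below (floating point, helper names ours) assumes that the rows of C are the unit normals $(\cos(k\pi/4), \sin(k\pi/4))$, k = 0, ..., 7, and that $d(i) = 1 + \sqrt{2}/2$. It recovers the vertices from the apothem d, confirms that the sides have length $\sqrt{2}$, and lists the distinct entries of the slack matrix.

```python
from math import cos, sin, pi, sqrt, hypot


def regular_octagon():
    """Facet normals C (8 x 2), right-hand side d and vertices V of the
    regular octagon {x | Cx <= d} with d(i) = 1 + sqrt(2)/2 for all i."""
    C = [(cos(k * pi / 4), sin(k * pi / 4)) for k in range(8)]
    d = 1 + sqrt(2) / 2
    R = d / cos(pi / 8)  # circumradius of a regular octagon with apothem d
    V = [(R * cos(pi / 8 + k * pi / 4), R * sin(pi / 8 + k * pi / 4))
         for k in range(8)]
    return C, d, V


def slack_matrix(C, d, V):
    """S(i, j) = d - C(i, :) v_j, the slack of vertex j in inequality i."""
    return [[d - (c[0] * v[0] + c[1] * v[1]) for v in V] for c in C]


if __name__ == "__main__":
    C, d, V = regular_octagon()
    S = slack_matrix(C, d, V)
    print("side length:", hypot(V[1][0] - V[0][0], V[1][1] - V[0][1]))
    print("distinct slacks:", sorted({round(s, 9) + 0.0 for row in S for s in row}))
```

Up to rounding, each column of the slack matrix has exactly two zeros and the remaining entries take the values 1, a and 1 + a with $a = 1 + \sqrt{2}$, in agreement with the constant a appearing in the counterexample of Section 4.3.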

5 Concluding Remarks 

In this paper, we have introduced a new quantity called the restricted nonnegative rank, whose com- 
putation amounts to solving a problem in computational geometry consisting of finding a polytope 
nested between two given polytopes. This allowed us to fully characterize its computational complex- 
ity (see Table [2]) . This geometric interpretation and the relationship between the nonnegative rank 
and the restricted nonnegative rank also let us derive new improved lower bounds for the nonnegative 
rank, in particular for slack matrices and linear Euclidean distance matrices. This also allowed us to 
provide counterexamples to two conjectures concerning the nonnegative rank. 

We conclude the paper with the following conjecture: 

Conjecture 2. Computing the nonnegative rank and the corresponding nonnegative factorization of
a nonnegative matrix is NP-hard when the rank of the matrix is fixed and greater than or equal to 4 (or
even possibly 3).
In fact, we have shown that computing a nonnegative factorization amounts to solving a nested
polytopes problem in which the outer polytope might live in a higher dimensional space. Moreover, this
space is not known a priori (we just know that it contains the columns of the matrix to be factorized,
cf. Section 3.1). Therefore, it seems plausible to assume that this problem is at least as difficult as
the restricted nonnegative rank computation problem in which the outer polytope lives in the same
low-dimensional space and is known. Moreover, even in the rank-three case, even though the inner
polytope has dimension two, the outer polytope might have any dimension (up to the dimensions of the
matrix; see, e.g., Figures 2 and 5); therefore, it seems that the nonnegative rank computation might
also be NP-hard if the rank of the matrix is three. Notice that, when $\operatorname{rank}_+^*(M) < 5$, Equation (3.5)
implies $\operatorname{rank}_+(M) = \operatorname{rank}_+^*(M)$ so that the nonnegative rank can be computed in polynomial time in
this particular case.

Table 2 summarizes the complexity results for the restricted nonnegative rank and the nonnegative
rank of a nonnegative matrix $M$.



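$r = \operatorname{rank}(M)$   $r_+^* = \operatorname{rank}_+^*(M)$   $r_+ = \operatorname{rank}_+(M)$
---------------------------------------------------------------------------------------------
$r$ not fixed                  NP-hard                                NP-hard [33]
$r \ge 4$ fixed                NP-hard (Theorem ED)                   NP-hard?
$r = 3$                        polynomial (Theorem 2)                 polynomial if $r_+^* \le 5$, otherwise NP-hard?
$r \le 2$                      trivial ($= r$)                        trivial ($= r$) [32]

Table 2: Complexity of restricted nonnegative rank and nonnegative rank computations.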

A MATLAB Codes

In this appendix, codes for two specific reductions are provided:

A.1 The reduction from any RNR instance of a rank-three matrix to a two-dimensional NPP instance.

A.2 The geometric interpretation and the visualization of a nonnegative factorization M = UV with
rank(M) = 3 and rank(U) = 4 as a solution of a nested polytopes problem, where the inner
polytope S is two-dimensional and the outer polytope P is three-dimensional (see Section 3.1).

A.1 RNR to NPP when rank(M) = 3

The following code has been used to generate Figure 6.

% 2-D representation of an NPP instance corresponding to the RNR instance
% of a rank-3 nonnegative matrix M, cf. Theorem 2.
%
% [P,A,B] = NMFrank3(M)
%
% Input.
%   M : (m x n) matrix of rank 3.
% Output.
%   P (2 x p, p <= m) : vertices of the outer polytope defined by Cx+d >= 0.
%   A (m x 3), B (3 x n) : M = AB and columns of A and B sum to one.

function [P,A,B] = NMFrank3(M)
if rank(M) ~= 3 || min(M(:)) < 0
    disp('The matrix is not rank 3 or not nonnegative'); return;
end
k = 3; [m,n] = size(M);
% 1. Remove zero rows/columns and normalize the columns of M
M = M(sum(M,2)>0, sum(M,1)>0); D = diag(1./sum(M,1)); M = M*D;
% 2. Compute the decomposition M = AB
[A,B] = basesumtoone(M,3);
% 3. Find the inequalities of the set P = { x in R^{k-1} | Cx+d >= 0 }
C = A(:,1:k-1) - repmat(A(:,k),1,k-1); d = A(:,k);
% 4. Draw P (outer polytope, red) and S (inner polytope, blue)
P = vertices(C,d); K = convhull(P(1,:),P(2,:));
figure; plot(P(1,K),P(2,K),'ro'); hold on; plot(P(1,K),P(2,K),'r');
K = convhull(B(1,:),B(2,:)); plot(B(1,:),B(2,:),'bo'); plot(B(1,K),B(2,K),'b-');



% Compute a rank-k decomposition M = AB such that columns of A and B sum to one
function [A,B] = basesumtoone(M,k)
[u,s,v] = svds(M,k); A = u*s; B = v';
sA = sum(A,1);
if min(sA) < 1e-3
    % Some column sum of A is too small to divide by: add the first column
    % of A to the others and compensate in B so that the product AB is unchanged
    A(:,2:k) = A(:,2:k) + repmat(A(:,1),1,k-1);
    B(1,:) = B(1,:) - sum(B(2:k,:),1);
    sA = sum(A,1);
    A = A*diag(1./sA); B = diag(sA)*B;
else
    A = A*diag(1./sA); B = diag(sA)*B;
end



% Find the vertices V of the set P = { x in R^{k-1} | Cx+d >= 0 } by brute force
function V = vertices(C,d)
[m,k] = size(C);   % here k is the number of columns of C (the dimension of x)
V = []; IP = 0;
choices = nchoosek(1:m,k);
% Choose k inequalities of Cx+d >= 0 (two in the planar case) and compute the intersection
for i = 1 : size(choices,1)
    if rank(C(choices(i,:),:)) == k
        x = C(choices(i,:),:) \ (-d(choices(i,:)));
        % Check if the intersection is in P and has not been found already
        if min(C*x+d) > -1e-9 && (IP == 0 || min(sum((V-repmat(x,1,IP)).^2,1)) > 1e-6)
            V = [V x]; IP = IP + 1;
        end
    end
end
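
Step 3 of NMFrank3 relies on the following observation: once the columns of A and B sum to one, the last row of B is determined by the others, so each column of M can be written as Cx + d, where x collects the first k-1 entries of the corresponding column of B. The script below is a minimal sketch that checks this identity numerically. The matrices U0 and V0 are arbitrary illustrative data (not taken from the paper), and the factorization is built directly from U0 rather than through basesumtoone.

```matlab
% Sketch only: U0 and V0 are arbitrary illustrative data, not from the paper.
k  = 3;
U0 = [1 0 2; 0 1 1; 3 1 0; 1 1 1; 2 0 1];   % 5 x 3, nonnegative, rank 3
V0 = [1 0 1 2; 0 1 1 1; 1 1 0 3];           % 3 x 4, nonnegative, rank 3
M  = U0 * V0;
M  = M * diag(1 ./ sum(M, 1));              % columns of M sum to one
n  = size(M, 2);

% Build M = A*B directly (no SVD): rescale U0 so that columns of A sum to one.
A = U0 * diag(1 ./ sum(U0, 1));
B = A \ M;                                  % exact up to rounding: A has full column rank

% Step 3 of NMFrank3: eliminate the last coordinate of each column of B.
C = A(:, 1:k-1) - repmat(A(:, k), 1, k-1);
d = A(:, k);
X = B(1:k-1, :);                            % 2-D coordinates of the columns of M

err = norm(C * X + repmat(d, 1, n) - M, 'fro');
fprintf('largest deviation of a column sum of B from one: %.2e\n', ...
        max(abs(sum(B, 1) - 1)));
fprintf('residual of M = C*X + d: %.2e\n', err);
```

Since M is nonnegative, the identity also shows that every point in X satisfies Cx + d >= 0, i.e., the inner points lie in the outer polytope P.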

A.2 3-D representation

The following code has been used to generate Figures 2 and 7.

% 3-D representation of a nonnegative factorization M = UV
% with rank(M) = 3 and rank(U) = 4. Displays only the intermediate simplex
% T and the set of (inner) points S, cf. Section 3.1.
%
% [A,B,Bp] = Visualisation3D(M,U)
%
% Input.
%   M >= 0 (m x n) : M has rank 3.
%   U >= 0 (m x k) : U has rank 4, and s.t. there exists V >= 0 : M = UV.
% Output.
%   A (m x 4), B (4 x k) : U = AB and columns of A and B sum to one.
%   Bp (4 x n) : M = ABp and columns of Bp sum to one.

function [A,B,Bp] = Visualisation3D(M,U)
if rank(U) ~= 4 || min(U(:)) < 0 || rank(M) ~= 3 || min(M(:)) < 0
    disp('The matrix U (resp. M) is not rank 4 (resp. 3) or not nonnegative'); return;
end
% 0. Columns of M and U sum to one
D = diag(1./sum(M,1)); M = M*D; Du = diag(1./sum(U,1)); U = U*Du;
% 1. Compute U = AB
[A,B] = basesumtoone(U,4);
% 2. Display the columns of B and draw T (edges of each facet of the convex hull)
P = B; K = convhulln(P(1:3,:)'); figure;
for i = 1 : size(K,1)
    plot3(P(1,K(i,1:2)),P(2,K(i,1:2)),P(3,K(i,1:2)),'m','linewidth',2); hold on;
    plot3(P(1,K(i,2:3)),P(2,K(i,2:3)),P(3,K(i,2:3)),'m','linewidth',2);
    plot3(P(1,K(i,[1 3])),P(2,K(i,[1 3])),P(3,K(i,[1 3])),'m','linewidth',2);
end
% 3. Compute and display the columns of Bp (M = ABp) and draw S
Bp = A\M; K = convhull(Bp(1,:),Bp(2,:));
plot3(Bp(1,K),Bp(2,K),Bp(3,K),'bo','linewidth',2); hold on;
plot3(Bp(1,K),Bp(2,K),Bp(3,K),'b-','linewidth',2);
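
The picture drawn by Visualisation3D can also be checked numerically: after the normalization of step 0, the columns of Bp should be convex combinations of the columns of B, i.e., the inner points S lie inside T. The script below is a minimal sketch of that check. The matrices M, U, V and Q are illustrative choices (not taken from the paper), and the factorization U = AB is built by hand instead of calling basesumtoone.

```matlab
% Sketch only: M, U, V and Q are illustrative choices, not taken from the paper.
M = [1 1 0 0; 1 0 1 0; 0 1 0 1; 0 0 1 1];   % nonnegative, rank 3
U = eye(4);  V = M;                          % M = U*V with rank(U) = 4

% Step 0 of Visualisation3D: columns of M and U sum to one.
M = M * diag(1 ./ sum(M, 1));
U = U * diag(1 ./ sum(U, 1));

% Any U = A*B whose factors have unit column sums will do; here B = Q is an
% arbitrary invertible matrix with unit column sums.
Q = 0.6 * eye(4) + 0.1 * ones(4);
B = Q;  A = U / Q;

Bp = A \ M;    % inner points S (columns of Bp), with M = A*Bp
W  = B \ Bp;   % weights expressing each column of Bp in terms of the columns of B

fprintf('residual of M = A*Bp: %.2e\n', norm(A * Bp - M, 'fro'));
fprintf('smallest weight: %.2e\n', min(W(:)));
fprintf('largest deviation of a weight sum from one: %.2e\n', ...
        max(abs(sum(W, 1) - 1)));
```

Nonnegative weights that sum to one in every column mean that each column of Bp is a convex combination of the columns of B.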